Splitting genomic data by chromosome and writing out files: a Python, awk, and R data.table speed comparison

My genome data is too large to fit in system memory when I process it in R, so I want to split each file by chromosome first. Python, awk, and R can all do this simply and quickly, but is there a difference in speed? Before running the scripts on several large 50 GB files, I tested and timed them on a 244 MB data set. ...
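
To make the splitting step concrete, here is a minimal Python sketch of the idea. It is an illustration under stated assumptions, not the script that was timed: it assumes a tab-delimited text file with no header row and the chromosome label in the first column. The function name `split_by_chromosome`, its parameters, and the `<chrom>.txt` output naming are all hypothetical. The file is streamed line by line, so memory use stays small regardless of input size.

```python
import os
from contextlib import ExitStack


def split_by_chromosome(input_path, output_dir, chrom_col=0, sep="\t"):
    """Stream input_path and copy each line into a per-chromosome file.

    Assumes (hypothetically) a delimited file with no header, where
    column `chrom_col` holds the chromosome label.
    Returns a dict mapping chromosome label -> number of lines written.
    """
    os.makedirs(output_dir, exist_ok=True)
    counts = {}
    handles = {}
    # ExitStack closes every per-chromosome file, even if an error occurs.
    with ExitStack() as stack, open(input_path) as src:
        for line in src:
            if not line.strip():
                continue  # skip blank lines
            chrom = line.rstrip("\n").split(sep)[chrom_col]
            if chrom not in handles:
                # First time this chromosome is seen: open its output file.
                out_path = os.path.join(output_dir, f"{chrom}.txt")
                handles[chrom] = stack.enter_context(open(out_path, "w"))
                counts[chrom] = 0
            handles[chrom].write(line)
            counts[chrom] += 1
    return counts
```

Calling `split_by_chromosome("input.tsv", "by_chrom")` (both names are placeholders) would write one file per distinct chromosome label into `by_chrom/`. Keeping one open handle per chromosome avoids reopening files for every line, which should matter for multi-gigabyte inputs, though the actual gain would need to be measured.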

Posted by toyartist on Fri, 21 Dec 2018 15:48:05 -0800