Remove line breaks in a FASTA file

予麋鹿 2020-12-05 01:26

I have a FASTA file where the sequences are broken up with newlines. I'd like to remove the newlines. Here's an example of my file:

>accession1
ATGGCC         


        
9 answers
  •  既然无缘
    2020-12-05 02:01

    The accepted solution is fine, but it's not particularly AWKish. Consider using this instead:

    awk '/^>/ { print (NR==1 ? "" : RS) $0; next } { printf "%s", $0 } END { printf RS }' file
    

    Explanation:

    For lines beginning with >, print the line. A ternary operator prepends a newline character (RS) unless the line is the first in the file; that newline terminates the joined sequence of the previous record. For lines not beginning with >, print the line without a trailing newline character, so consecutive sequence lines run together. Since the last line in the file won't begin with >, use the END block to print a final newline character.
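
    As a quick check of that behaviour, here is a small sketch run on a made-up two-record input. The sequences are invented for illustration, and the data is piped in on standard input rather than read from `file`:

```shell
# Hypothetical sample input: two records, each wrapped over two lines.
# Same awk program as above, reading standard input instead of a file.
result=$(printf '>accession1\nATGGCC\nCATTAG\n>accession2\nGGGTTT\nAAA\n' |
  awk '/^>/ { print (NR==1 ? "" : RS) $0; next } { printf "%s", $0 } END { printf RS }')

printf '%s\n' "$result"
# Prints:
# >accession1
# ATGGCCCATTAG
# >accession2
# GGGTTTAAA
```

    Each header stays on its own line and the wrapped sequence lines under it are joined into one.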

    Note that the above can also be written more briefly by setting a null output record separator (ORS), enabling default printing with the trailing 1, and reassigning $0 for lines beginning with >. Try:

    awk -v ORS= '/^>/ { $0 = (NR==1 ? "" : RS) $0 RS } END { printf RS }1' file
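
    One way to convince yourself that the two forms agree is to run both on the same input and compare the results. A minimal sketch, again with invented sample data supplied on standard input:

```shell
# Hypothetical sample input (not taken from the question).
input=$(printf '>accession1\nATGGCC\nCATTAG\n>accession2\nGGGTTT\nAAA\n')

# Longer form: explicit print/printf per line type.
long_form=$(printf '%s\n' "$input" |
  awk '/^>/ { print (NR==1 ? "" : RS) $0; next } { printf "%s", $0 } END { printf RS }')

# Shorter form: empty ORS, headers rewritten in place, default print via the trailing 1.
short_form=$(printf '%s\n' "$input" |
  awk -v ORS= '/^>/ { $0 = (NR==1 ? "" : RS) $0 RS } END { printf RS }1')

if [ "$long_form" = "$short_form" ]; then
  echo "identical"
else
  echo "different"
fi
```

    This only compares the two commands on one small input, so it is a sanity check rather than a proof of equivalence.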
    
