grep (bash) multi-line pattern

感情败类 2021-01-15 01:14

In bash (4.3.46(1)) I have some multi-line, so-called FASTA records, where each record begins with one line of the form >name, followed by lines of DNA sequence ([AGCTNacgtn]),

6 answers
  •  旧时难觅i
    2021-01-15 02:07

    The best tool for working with multi-line records is awk.

    In your case:

    awk 'BEGIN{RS=">"} NR==2 {print RS$0}' input.txt
    

    input.txt

    >chr1
    AGCTACTTTT
    AGGGNGGTNN
    >chr2
    TTGNACACCC
    TGGGGGAGTA
    >chr3
    TGACGTGGGT
    TCGGGTTTTT
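
To try the command, the sample file can be recreated with a here-document. This is a minimal sketch: the temporary file created with mktemp and the variable names are assumptions for illustration, not part of the original answer.

```shell
# Recreate the sample input in a temporary file (the path is illustrative).
input=$(mktemp)
cat > "$input" <<'EOF'
>chr1
AGCTACTTTT
AGGGNGGTNN
>chr2
TTGNACACCC
TGGGGGAGTA
>chr3
TGACGTGGGT
TCGGGTTTTT
EOF

# The file starts with ">", so awk's record 1 is the empty text before it
# and record 2 is the chr1 record.
first=$(awk 'BEGIN{RS=">"} NR==2 {print RS$0}' "$input")
printf '%s\n' "$first"
```

Capturing the output with $(...) strips trailing newlines, so the three lines of the chr1 record are printed without a blank line after them.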
    

    Explanation:

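    BEGIN{RS=">"} sets the record separator to ">" before any input is read, so each FASTA record becomes one awk record.

    NR==2 selects record #2 only. Because the file starts with ">", record #1 is the empty text before the first ">", so record #2 is the first FASTA record (chr1).

    {print RS$0} prints that record with the ">" that awk stripped as the separator put back in front. Note that print appends a newline of its own, so the output ends with one extra blank line.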
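
If the record should be chosen by name rather than by position, the same record-separator approach can compare the header line instead of NR. This is a minimal sketch: the function name fasta_record, the exact-match rule on the header, and the small sample file are assumptions for illustration, not something the answer prescribes.

```shell
# fasta_record NAME FILE  (hypothetical helper, not from the answer)
# Print the FASTA record whose header line is exactly ">NAME".
fasta_record() {
    awk -v name="$1" '
        BEGIN { RS = ">"; FS = "\n" }      # one record per ">", one field per line
        NR > 1 && ($1 "") == name {        # skip the empty record 1; force a string comparison
            printf "%s%s", RS, $0          # printf avoids the extra newline that print adds
        }
    ' "$2"
}

# Usage with a small sample file (the temporary path is illustrative).
sample=$(mktemp)
printf '>chr1\nAGCT\n>chr2\nTTGN\nTGGG\n' > "$sample"
fasta_record chr2 "$sample"
rm -f "$sample"
```

Setting FS to a newline makes $1 the header line, so the match is on the whole name and a request for chr1 would not also pick up a record named chr10.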
