How to print lines between two patterns, inclusive or exclusive (in sed, AWK or Perl)?

生来不讨喜 2020-11-21 05:37

I have a file like the following and I would like to print the lines between two given patterns PAT1 and PAT2.



        
8 Answers
  • What about the classic sed solution?

    Print lines between PAT1 and PAT2 - include PAT1 and PAT2

    sed -n '/PAT1/,/PAT2/p' FILE
    

    Print lines between PAT1 and PAT2 - exclude PAT1 and PAT2

    GNU sed
    sed -n '/PAT1/,/PAT2/{/PAT1/!{/PAT2/!p}}' FILE
    
    Any sed [1]
    sed -n '/PAT1/,/PAT2/{/PAT1/!{/PAT2/!p;};}' FILE
    

    or even (Thanks Sundeep):

    GNU sed
    sed -n '/PAT1/,/PAT2/{//!p}' FILE
    
    Any sed
    sed -n '/PAT1/,/PAT2/{//!p;}' FILE
    

    Print lines between PAT1 and PAT2 - include PAT1 but not PAT2

    The following includes just the range start:

    GNU sed
    sed -n '/PAT1/,/PAT2/{/PAT2/!p}' FILE
    
    Any sed
    sed -n '/PAT1/,/PAT2/{/PAT2/!p;}' FILE
    

    Print lines between PAT1 and PAT2 - include PAT2 but not PAT1

    The following includes just the range end:

    GNU sed
    sed -n '/PAT1/,/PAT2/{/PAT1/!p}' FILE
    
    Any sed
    sed -n '/PAT1/,/PAT2/{/PAT1/!p;}' FILE
    

    [1] Note about BSD/Mac OS X sed

    On BSD/Mac OS X sed, a command like this:

    sed -n '/PAT1/,/PAT2/{/PAT1/!{/PAT2/!p}}' FILE
    

    would emit an error, because BSD sed expects a ; or a newline before each closing }:

    ▶ sed -n '/PAT1/,/PAT2/{/PAT1/!{/PAT2/!p}}' FILE
    sed: 1: "/PAT1/,/PAT2/{/PAT1/!{/ ...": extra characters at the end of p command
    

    For this reason this answer has been edited to include BSD and GNU versions of the one-liners.
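
    As a quick check of the four variants, the sketch below runs the portable ("Any sed") forms against a small made-up input. The file contents and the shell variable names are assumptions for illustration, not the asker's data.

```shell
# Hypothetical sample input (not the asker's file): two PAT1..PAT2 blocks.
sample=$(mktemp)
printf '%s\n' a PAT1 b c PAT2 d PAT1 e PAT2 f > "$sample"

# Portable forms: a ';' precedes every closing '}'.
both=$(sed -n '/PAT1/,/PAT2/p' "$sample")                    # include PAT1 and PAT2
neither=$(sed -n '/PAT1/,/PAT2/{//!p;}' "$sample")           # exclude both
start_only=$(sed -n '/PAT1/,/PAT2/{/PAT2/!p;}' "$sample")    # include PAT1 only
end_only=$(sed -n '/PAT1/,/PAT2/{/PAT1/!p;}' "$sample")      # include PAT2 only

printf 'both:\n%s\n\n' "$both"
printf 'neither:\n%s\n\n' "$neither"
printf 'start only:\n%s\n\n' "$start_only"
printf 'end only:\n%s\n' "$end_only"

rm -f "$sample"
```

    On this input the "neither" form should leave only the lines b, c and e.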

  • 2020-11-21 05:54

    You can do what you want with sed by suppressing the normal printing of the pattern space with -n. For instance, to include the patterns in the result, you can do:

    $ sed -n '/PAT1/,/PAT2/p' filename
    PAT1
    3    - first block
    4
    PAT2
    PAT1
    7    - second block
    PAT2
    PAT1
    10    - third block
    

    To exclude the patterns and just print what is between them:

    $ sed -n '/PAT1/,/PAT2/{/PAT1/{n};/PAT2/{d};p}' filename
    3    - first block
    4
    7    - second block
    10    - third block
    

    Which breaks down as:

    • sed -n '/PAT1/,/PAT2/ - suppress automatic printing and select the range of lines from PAT1 through PAT2;

    • /PAT1/{n}; - if the line matches PAT1, replace it with the next input line (n); nothing is printed because of -n;

    • /PAT2/{d}; - if the line matches PAT2, delete it and start the next cycle;

    • p - print every line within /PAT1/,/PAT2/ that was not skipped or deleted.
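
    The same breakdown can be written as a multi-line sed script with one comment per step. This is only a sketch of the one-liner above: the sample input and the variable name `between` are made up for illustration.

```shell
# Hypothetical sample input (not the asker's file).
sample=$(mktemp)
printf '%s\n' a PAT1 b c PAT2 d PAT1 e PAT2 f > "$sample"

between=$(sed -n '
/PAT1/,/PAT2/{
  # range start: swap the PAT1 line for the next input line
  /PAT1/n
  # range end: drop the PAT2 line and begin the next cycle
  /PAT2/d
  # whatever is left lies strictly between the two markers
  p
}' "$sample")

printf '%s\n' "$between"
rm -f "$sample"
```

    Writing each command on its own line also sidesteps the question of whether a ; is needed before the closing brace.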

  • 2020-11-21 05:55

    Alternatively:

    sed '/START/,/END/!d;//d'
    

    This deletes every line outside the START to END range (the range itself is inclusive), and then //d deletes the START and END lines themselves, because an empty regex // makes sed reuse the last regex it applied.
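
    To see what the empty regex buys you, the sketch below compares the short form with a spelled-out version that names both patterns again. The sample input and variable names are assumptions for illustration.

```shell
# Hypothetical sample input using the START/END markers from the one-liner.
sample=$(mktemp)
printf '%s\n' a START b c END d > "$sample"

# Short form: '//' reuses whichever regex the range address tested last.
short=$(sed '/START/,/END/!d;//d' "$sample")

# Spelled-out form: delete the marker lines by naming each pattern.
explicit=$(sed '/START/,/END/!d;/START/d;/END/d' "$sample")

printf 'short:\n%s\n\nexplicit:\n%s\n' "$short" "$explicit"
rm -f "$sample"
```

    On this input both forms should leave just the lines b and c.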

  • 2020-11-21 05:55

    This is a footnote to the two top answers (awk and sed). I needed to run the command on a large number of files, so performance mattered. I ran each of the two answers 10000 times as a load test:

    sedTester.sh

    for i in `seq 10000`;do sed -n '/PAT1/,/PAT2/{/PAT1/!{/PAT2/!p;};}' patternTester >> sedTesterOutput; done
    

    awkTester.sh

    for i in `seq 10000`;do awk '/PAT1/{flag=1; next} /PAT2/{flag=0} flag' patternTester >> awkTesterOutput; done
    

    Here are the results:

    zsh sedTester.sh  11.89s user 39.63s system 81% cpu 1:02.96 total
    zsh awkTester.sh  38.73s user 60.64s system 79% cpu 2:04.83 total
    

    The sed solution appears to be about twice as fast as the awk solution (on Mac OS).
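
    A timing comparison only means something if both commands produce the same output, so a sanity check along these lines may be worth running first. It is a sketch on a made-up input; the temporary file stands in for patternTester, whose contents are not shown here.

```shell
# Hypothetical stand-in for the patternTester file.
sample=$(mktemp)
printf '%s\n' a PAT1 b c PAT2 d PAT1 e PAT2 f > "$sample"

# The two commands from the load test, each run once.
sed_out=$(sed -n '/PAT1/,/PAT2/{/PAT1/!{/PAT2/!p;};}' "$sample")
awk_out=$(awk '/PAT1/{flag=1; next} /PAT2/{flag=0} flag' "$sample")

if [ "$sed_out" = "$awk_out" ]; then
    echo "outputs match"
else
    echo "outputs differ"
fi

rm -f "$sample"
```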

  • 2020-11-21 05:56

    Here is another approach, using awk:

    Include both patterns (default)

    $ awk '/PAT1/,/PAT2/' file
    PAT1
    3    - first block
    4
    PAT2
    PAT1
    7    - second block
    PAT2
    PAT1
    10    - third block
    

    Mask both patterns

    $ awk '/PAT1/,/PAT2/{if(/PAT2|PAT1/) next; print}' file
    3    - first block
    4
    7    - second block
    10    - third block
    

    Mask start pattern

    $ awk '/PAT1/,/PAT2/{if(/PAT1/) next; print}' file
    3    - first block
    4
    PAT2
    7    - second block
    PAT2
    10    - third block
    

    Mask end pattern

    $ awk '/PAT1/,/PAT2/{if(/PAT2/) next; print}' file
    PAT1
    3    - first block
    4
    PAT1
    7    - second block
    PAT1
    10    - third block
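
    The range patterns above can also be expressed with an explicit flag variable, the idiom used in the awk one-liner `/PAT1/{flag=1; next} /PAT2/{flag=0} flag`. The sketch below checks that the two styles agree on a small made-up input. The input and variable names are assumptions, and the styles can diverge on edge cases such as a single line containing both patterns.

```shell
# Hypothetical sample input (not the asker's file).
sample=$(mktemp)
printf '%s\n' a PAT1 b c PAT2 d PAT1 e PAT2 f > "$sample"

# Include both patterns: range form vs. flag form.
range_both=$(awk '/PAT1/,/PAT2/' "$sample")
flag_both=$(awk '/PAT1/{f=1} f; /PAT2/{f=0}' "$sample")

# Mask both patterns: range form vs. flag form.
range_inner=$(awk '/PAT1/,/PAT2/{if(/PAT2|PAT1/) next; print}' "$sample")
flag_inner=$(awk '/PAT1/{f=1; next} /PAT2/{f=0} f' "$sample")

printf 'include both:\n%s\n\nmask both:\n%s\n' "$flag_both" "$flag_inner"
rm -f "$sample"
```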
    
  • 2020-11-21 06:00

    For completeness, here is a Perl solution:

    Print lines between PAT1 and PAT2 - include PAT1 and PAT2

    perl -ne '/PAT1/../PAT2/ and print' FILE
    

    or:

    perl -ne 'print if /PAT1/../PAT2/' FILE
    

    Print lines between PAT1 and PAT2 - exclude PAT1 and PAT2

    perl -ne '/PAT1/../PAT2/ and !/PAT1/ and !/PAT2/ and print' FILE
    

    or:

    perl -ne 'if (/PAT1/../PAT2/) {print unless /PAT1/ or /PAT2/}' FILE 
    

    Print lines between PAT1 and PAT2 - exclude PAT1 only

    perl -ne '/PAT1/../PAT2/ and !/PAT1/ and print' FILE
    

    Print lines between PAT1 and PAT2 - exclude PAT2 only

    perl -ne '/PAT1/../PAT2/ and !/PAT2/ and print' FILE
    

    See also:

    • Range operator section in perldoc perlop for more on the /PAT1/../PAT2/ grammar:

    Range operator

    ...In scalar context, ".." returns a boolean value. The operator is bistable, like a flip-flop, and emulates the line-range (comma) operator of sed, awk, and various editors.

    • See perldoc perlrun for the -n option, which makes Perl behave like sed -n.

    • Perl Cookbook, 6.8 for a detailed discussion of extracting a range of lines.
