Print all lines between two patterns, exclusive, first instance only (in sed, AWK or Perl) [duplicate]

白昼怎懂夜的黑, submitted on 2021-02-07 09:33:34

Question


Using sed, AWK (or Perl), how do you print all lines between (the first instance of) two patterns, exclusive of the patterns? [1]

That is, given as input:

aaa
PATTERN1
bbb
ccc
ddd
PATTERN2
eee

Or possibly even:

aaa
PATTERN1
bbb
ccc
ddd
PATTERN2
eee
fff
PATTERN1
ggg
hhh
iii
PATTERN2
jjj

I would expect, in both cases:

bbb
ccc
ddd

[1] A number of users voted to close this question as a duplicate of this one. In the end, I provided a gist that proves they are different. The question is also superficially similar to a number of others, but none is an exact match and none is of high quality. Because I believe this specific problem is the one most commonly faced, it deserves a clear formulation and a set of correct, clear answers.


Answer 1:


With awk (this assumes that PATTERN1 and PATTERN2 always occur in pairs and that neither of them occurs inside a pair):

$ cat ip.txt
aaa
PATTERN1
bbb
ccc
ddd
PATTERN2
eee
fff
PATTERN1
ggg
hhh
iii
PATTERN2
jjj

$ awk '/PATTERN2/{exit} f; /PATTERN1/{f=1}' ip.txt
bbb
ccc
ddd
  • /PATTERN1/{f=1} set flag if /PATTERN1/ is matched
  • /PATTERN2/{exit} exit if /PATTERN2/ is matched
  • f; print input line if flag is set
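
The command above exits at the first PATTERN2 even if no PATTERN1 has been seen yet, so a stray PATTERN2 near the top of the file would produce empty output. If that case matters, one possible variation (a sketch, not part of the original answer) is to exit only once the flag is set. The sample input here is made up for illustration:

```shell
# Hypothetical input in which a stray PATTERN2 precedes the first PATTERN1.
input=$(printf '%s\n' PATTERN2 aaa PATTERN1 bbb ccc PATTERN2 ddd)

# Exit on PATTERN2 only after PATTERN1 has been seen (f is set).
result=$(printf '%s\n' "$input" | awk 'f && /PATTERN2/{exit} f; /PATTERN1/{f=1}')
printf '%s\n' "$result"
```

With this input the guarded version should print bbb and ccc, whereas the unguarded one prints nothing.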


A generic solution, where the required block is selected with the variable b:

$ awk -v b=1 '/PATTERN2/ && c==b{exit} c==b; /PATTERN1/{c++}' ip.txt
bbb
ccc
ddd
$ awk -v b=2 '/PATTERN2/ && c==b{exit} c==b; /PATTERN1/{c++}' ip.txt
ggg
hhh
iii
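
If the generic form is used often, it could be wrapped in a small shell function. The following is only a sketch: the name between and its argument order are assumptions made for illustration, and the patterns are passed to awk as dynamic regular expressions (note that awk interprets backslash escapes in -v values).

```shell
# between N START END [FILE...]
# Hypothetical helper: print the lines strictly inside the N-th START..END block.
between() {
    n=$1 start=$2 end=$3
    shift 3
    awk -v b="$n" -v s="$start" -v e="$end" '
        $0 ~ e && c == b { exit }   # closing line of the wanted block: stop
        c == b                      # inside the wanted block: print the line
        $0 ~ s { c++ }              # opening line: count the block
    ' "$@"
}

# Example: ask for the second block of a made-up input.
printf '%s\n' aaa PATTERN1 bbb PATTERN2 ccc PATTERN1 ddd eee PATTERN2 fff |
    between 2 PATTERN1 PATTERN2
```

For this made-up input the call should print ddd and eee; with 1 as the first argument it should print bbb.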



Answer 2:


If you have GNU sed (tested using version 4.7 on Mac OS X), the simplest solution could be:

sed '0,/PATTERN1/d;/PATTERN2/Q'

Explanation:

  • The d command deletes from line 1 to the line matching /PATTERN1/ inclusive.
  • The Q command then exits without printing on the first line matching /PATTERN2/.

If the file has only one instance of the pattern pair, or if you don't mind extracting all of them, and you want a solution that doesn't depend on a GNU extension, this works:

sed -n '/PATTERN1/,/PATTERN2/{//!p}'

Explanation:

  • Note that the empty regular expression // repeats the last regular expression match.
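
For a version that stops after the first block without relying on the GNU-only 0,/re/ address or the Q command, one possible sketch (not part of the original answer) combines the range with d and q. The function name first_block is hypothetical, and the sketch assumes that PATTERN1 does not occur inside a block:

```shell
# Hypothetical helper: print only the first PATTERN1..PATTERN2 block, exclusive.
#   /PATTERN1/d  drops the opening line and starts the next cycle
#   /PATTERN2/q  quits at the first closing line (-n keeps it from being printed)
#   p            prints every other line inside the range
first_block() {
    sed -n '/PATTERN1/,/PATTERN2/{/PATTERN1/d;/PATTERN2/q;p;}' "$@"
}

printf '%s\n' aaa PATTERN1 bbb ccc PATTERN2 ddd PATTERN1 eee PATTERN2 | first_block
```

On this made-up input the function should print bbb and ccc only.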



Answer 3:


This might work for you (GNU sed):

sed -n '/PATTERN1/{:a;n;/PATTERN2/q;p;$!ba}' file

This prints only the lines between the first set of delimiters, or if the second delimiter does not exist, to the end of the file.
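
If printing to the end of the file is not wanted when the second delimiter is missing, the lines can be buffered and emitted only once PATTERN2 is actually found. The following awk sketch illustrates the idea; the function name strict_block is hypothetical, and the whole block is held in memory:

```shell
# Hypothetical helper: print the first block only if its closing line exists.
strict_block() {
    awk '
        f && /PATTERN2/ { printf "%s", buf; exit }  # block is complete: emit it
        f               { buf = buf $0 "\n" }       # inside the block: remember the line
        /PATTERN1/      { f = 1 }                   # opening line seen
    ' "$@"
}

# Complete block: prints bbb and ccc.
printf '%s\n' aaa PATTERN1 bbb ccc PATTERN2 ddd | strict_block
# Unterminated block: prints nothing.
printf '%s\n' aaa PATTERN1 bbb ccc | strict_block
```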




Answer 4:


I attempted to answer twice, but the questions kept switching between on-hold and duplicate status.

Borrowing input from @Sundeep and adding the answer which I shared in the question comments.

Using awk

awk -v x=0 -v y=1 ' /PATTERN1/&&y { x=1;next } /PATTERN2/&&y { x=0;y=0; next } x ' file

with Perl

perl -0777 -ne ' while( /PATTERN1.*?\n(.+?)^[^\n]*?PATTERN2/msg ) { print $1 if $x++ <1 } '

Results:

$ cat ip.txt
aaa
PATTERN1
bbb
ccc
ddd
PATTERN2
eee
PATTERN1
2
46
PATTERN2
xyz

$

$ awk -v x=0 -v y=1 ' /PATTERN1/&&y { x=1;next } /PATTERN2/&&y { x=0;y=0; next } x ' ip.txt
bbb
ccc
ddd

$ perl -0777 -ne ' while( /PATTERN1.*?\n(.+?)^[^\n]*?PATTERN2/msg ) { print $1 if $x++ <1 } ' ip.txt
bbb
ccc
ddd

$

To make it generic:

With awk, where y is the number of the block to print:

awk -v x=0 -v y=2 ' /PATTERN1/ { x++;next } /PATTERN2/ { if(x==y) exit } x==y ' ip.txt
2
46

With Perl, compare ++$x against the desired occurrence (here it is 2):

perl -0777 -ne ' while( /PATTERN1.*?\n(.+?)^[^\n]*?PATTERN2/msg ) { print $1 if ++$x==2 } ' ip.txt
2
46



Answer 5:


Adding a few more possible approaches, just for fun :) I am not claiming that these are better than the usual ones. All were written and tested in GNU awk, and only against the given examples.

1st Solution:

awk -v RS="" -v FS="PATTERN2" -v ORS="" '$1 ~ /\nPATTERN1\n/{sub(/.*PATTERN1\n/,"",$1);print $1}' Input_file

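2nd solution (note that [^(PATTERN2)] is a bracket expression, so the match stops at any one of the characters ( ) P A T E R N 2 rather than at the string PATTERN2; it works here only because the wanted lines contain none of those characters):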

awk -v RS="" -v ORS="" 'match($0,/PATTERN1[^(PATTERN2)]*/){val=substr($0,RSTART,RLENGTH);gsub(/^PATTERN1\n|^$\n/,"",val);print val}' Input_file

3rd solution:

awk -v RS="" -v OFS="\n" -v ORS="" 'sub(/PATTERN2.*/,"") && sub(/.*PATTERN1/,"PATTERN1"){$1=$1;sub(/^PATTERN1\n/,"")} 1' Input_file

All of the commands above produce the following output:

bbb
ccc
ddd



Answer 6:


Using GNU sed:

sed -nE '/PATTERN1/{:s n;/PATTERN2/q;p;bs}'

-n suppresses automatic printing, so nothing is output (including the PATTERN1 and PATTERN2 lines) except what the p command prints explicitly. An address applies only to the single command that follows it, so the commands have to be grouped with {}. On the line matching PATTERN1, the n command fetches the next line, which drops PATTERN1. If that line matches the first PATTERN2, sed quits immediately; otherwise the line is printed and bs branches back to the label s to continue with the next line.
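
The same loop can be laid out as a multi-line script with one command per line, which may make the control flow easier to follow. This is a sketch of the command above with comments added; the function name loop_block is hypothetical:

```shell
# Hypothetical helper: the one-liner above, spread out and commented.
loop_block() {
    sed -n '
        /PATTERN1/ {
            # label marking the start of the loop
            :s
            # replace the pattern space with the next input line
            n
            # first closing line: quit without printing anything
            /PATTERN2/q
            # otherwise print the line and branch back to the label
            p
            bs
        }
    ' "$@"
}

printf '%s\n' aaa PATTERN1 bbb ccc PATTERN2 ddd PATTERN1 eee PATTERN2 | loop_block
```

On this made-up input the function should print bbb and ccc.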



Source: https://stackoverflow.com/questions/55220417/print-all-lines-between-two-patterns-exclusive-first-instance-only-in-sed-aw
