Question
I have a sed command that successfully prints the lines between two patterns:
sed -n '/PAGE 2/,/\x0c/p' filename.txt
What I haven't figured out is how to make it print all the lines from the first token up to, but not including, the second token. The \x0c token is a record separator in a big flat file, and I need to keep THAT line intact.
In between the two tokens, the data is completely variable, and I do not have a reliable anchor to work with.
[CLARIFICATION]
Right now it prints all the lines between /PAGE 2/ and /\x0c/ inclusive. I want it to print /PAGE 2/ up until the next /\x0c/ in the record.
[test data] The \x0c will be at the start of the first line, and at the start of the last line of this record.
I need to delete the first line of the record, through the line just before the beginning of the next record.
^L20-SEP-2006 01:54:08 PM Foobars College PAGE 2
TERM: 200610 Student Billing Statement SUMDATA
99999
Foo bar R0000000
999 Geese Rural Drive DUE: 15-OCT-2012
Columbus, NE 90210
--------------------------------------------------------------------------------
Balance equal to or greater than $5000.00 $200.00
Billing inquiries may be directed to 444/555-1212 or by
email to bursar@foobar.edu. Financial Aid inquiries should
be directed to 444/555-1212 or finaid@foobar.edu.
^L20-SEP-2006 01:54:08 PM Foobars College PAGE 1
[expected result]
^L20-SEP-2006 01:54:08 PM Foobars College PAGE 1
There will be multiple such records in the file. I can rely only on the /PAGE 2/ token and the /\x0c/ token.
[solution]:
Following Choruba's lead, I edited his command to:
sed '/PAGE [2-9]/,/\x0c/{/\x0c$/!d}'
The rule in the curly brackets was applying itself to any line containing a ^L and was selectively ignoring them.
Answer 1:
EDIT: New answer for the new question the OP asked (how to delete records):
Given a file with control-Ls delimiting records and a desire to print specific lines from specific records, just set your record separator to control-L and your field separator to "\n" and print whatever you like. For example, to get the output the OP wants from the input he posted, this is enough:
awk -v RS='^L' -F'\n' 'NR==3{print $1}' file
^L shown here represents a literal control-L, and it's the 3rd record because there's an empty record before the first control-L in the input file.
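A minimal sketch of the record-separator approach, assuming an awk that expands the \f escape (form feed, the same character as control-L) in -v assignments. The sample text and the function name first_line_of_record are made up for illustration:

```shell
# Hypothetical two-record sample; each record starts with a form feed (\f = control-L).
sample=$(printf '\f20-SEP-2006 Foobars College PAGE 2\nTERM: 200610\n99999\n\f20-SEP-2006 Foobars College PAGE 1\n')

# Print the first line of record number $1. Records are split on form feeds,
# fields on newlines. Record 1 is the empty text before the first form feed.
first_line_of_record() {
    awk -v RS='\f' -F'\n' -v n="$1" 'NR == n { print $1 }'
}

printf '%s\n' "$sample" | first_line_of_record 3
```

With this sample the call prints the PAGE 1 header without the control-L, because the separator is not part of the record text.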
This is the answer to the original question the OP asked:
You want this:
awk '/PAGE 2/ {f=1} /\x0c/{f=0} f' file
but also try these to see the difference (for the future):
awk '/PAGE 2/ {f=1} f; /\x0c/{f=0}' file
awk 'f; /PAGE 2/ {f=1} /\x0c/{f=0}' file
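To see how the position of the bare f changes which boundary lines are printed, here is a small sketch. It uses hypothetical START and END markers in place of PAGE 2 and the form feed, and the function names are invented for the example:

```shell
# Hypothetical input; START and END stand in for the two tokens.
lines=$(printf 'a\nSTART\nb\nc\nEND\nd\n')

# Set the flag, clear the flag, then print: includes START, excludes END.
from_start_before_end() { awk '/START/ {f=1} /END/ {f=0} f'; }

# Set the flag, print, then clear the flag: includes both START and END.
from_start_through_end() { awk '/START/ {f=1} f; /END/ {f=0}'; }

# Print first, then update the flag: excludes START, includes END.
after_start_through_end() { awk 'f; /START/ {f=1} /END/ {f=0}'; }

printf '%s\n' "$lines" | from_start_before_end    # START, b, c
printf '%s\n' "$lines" | from_start_through_end   # START, b, c, END
printf '%s\n' "$lines" | after_start_through_end  # b, c, END
```

Note that if the start line also matches the end pattern, the first form sets and clears the flag on the same line and prints nothing.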
And finally, FYI, the following idioms describe how to select a range of records given a specific pattern to match:
a) Print all records from some pattern:
awk '/pattern/{f=1}f' file
b) Print all records after some pattern:
awk 'f;/pattern/{f=1}' file
c) Print the Nth record after some pattern:
awk 'c&&!--c;/pattern/{c=N}' file
d) Print every record except the Nth record after some pattern:
awk 'c&&!--c{next}/pattern/{c=N}1' file
e) Print the N records after some pattern:
awk 'c&&c--;/pattern/{c=N}' file
f) Print every record except the N records after some pattern:
awk 'c&&c--{next}/pattern/{c=N}1' file
g) Print the N records from some pattern:
awk '/pattern/{c=N}c&&c--' file
I changed the variable name from "f" for "found" to "c" for "count" where appropriate as that's more expressive of what the variable actually IS.
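A few of the counting idioms above can be tried on a toy input as follows. This is only a sketch: the pattern HIT, the input lines, and the function names are placeholders, and N is passed in with -v rather than typed into the script:

```shell
# Hypothetical input with the pattern HIT on the third line.
input=$(printf 'l1\nl2\nHIT\nl4\nl5\nl6\n')

# c) only the Nth record after the match
nth_after() { awk -v N="$1" 'c&&!--c; /HIT/{c=N}'; }

# e) the N records after the match
n_after() { awk -v N="$1" 'c&&c--; /HIT/{c=N}'; }

# f) every record except the N records after the match
all_but_n_after() { awk -v N="$1" 'c&&c--{next} /HIT/{c=N} 1'; }

# g) N records starting with the match itself
n_from() { awk -v N="$1" '/HIT/{c=N} c&&c--'; }

printf '%s\n' "$input" | nth_after 2        # l5
printf '%s\n' "$input" | n_after 2          # l4, l5
printf '%s\n' "$input" | all_but_n_after 2  # l1, l2, HIT, l6
printf '%s\n' "$input" | n_from 2           # HIT, l4
```

In each case c counts down once per record after the match, so the only difference between the idioms is whether the test runs before or after the counter is set and whether a hit prints or skips.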
Answer 2:
Tell sed not to print the line containing the character:
sed -n '/PAGE 2/,/\x0c/{/\x0c/!p}' filename.txt
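The effect of the inner /\x0c/!p filter can be checked with a short sketch. It assumes nothing beyond a POSIX sed: the form feed is built with printf and spliced into the script instead of relying on the \x0c escape, and the sample records and function name are hypothetical:

```shell
# A literal form feed (control-L), built with printf.
ff=$(printf '\f')

# Same shape as the command above: print the range, minus any line
# that contains a form feed.
range_without_ff() {
    sed -n '/PAGE 2/,/'"$ff"'/{/'"$ff"'/!p;}'
}

# Header without a form feed: the header and body are printed.
printf 'header PAGE 2\nbody\n\fnext record\n' | range_without_ff

# Header that itself starts with a form feed, as in the question's data:
# the header is filtered out together with the closing line.
printf '\fheader PAGE 2\nbody 1\nbody 2\n\fheader PAGE 1\n' | range_without_ff
```

The second call prints only the two body lines, which is why the filter needs adjusting when the opening line of a record also carries the control-L.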
Answer 3:
I think this would do it:
awk '/PAGE 2/{a=1}/\x0c/{a=0}{if(a)print}'
Answer 4:
In this line, the second sed deletes (d) the last line ($).
sed -n '/^START$/,/^STOP$/p' in.txt | sed '$d'
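One thing to keep in mind with this pipeline is that $d removes only the final line of the whole stream, so it fits the case of a single range. A quick sketch with made-up input containing two ranges (the function name is invented for the example):

```shell
# Hypothetical input with two START/STOP ranges.
two_ranges=$(printf 'a\nSTART\nb\nSTOP\nc\nSTART\nd\nSTOP\n')

# First sed prints both ranges inclusively; second sed drops the last line only.
range_minus_last_line() {
    sed -n '/^START$/,/^STOP$/p' | sed '$d'
}

printf '%s\n' "$two_ranges" | range_minus_last_line
```

Here the output is START, b, STOP, START, d: the first STOP survives and only the second one is dropped.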
Answer 5:
Following Choruba's lead, I edited his command to:
sed '/PAGE [2-9]/,/\x0c/{/\x0c$/!d}'
Source: https://stackoverflow.com/questions/13177772/sed-or-awk-deleting-lines-between-pattern-matches-excluding-the-second-tokens