Question
Using grep, I can print all occurrences of the uppercase letter "Z" in my document. The output, however, displays the entire line in which each "Z" was found. I need to limit this to printing only the 10 letters that appear before every occurrence of "Z".
E.g., if the document has a line with "AAAABBBBBBBBBCCCCCCDDDDDDDZ", it will print "CCCDDDDDDD", the 10 letters appearing before the "Z".
- If there are fewer than 10 letters prior to "Z", then nothing needs to be printed.
- If "Z" appears multiple times in a single line, the 10 letters preceding each of these "Z"s should be printed, e.g.: "AAAABBBBBBBBBZCCCCCDDDDDDDZ" will print "ABBBBBBBBB" and "CCCDDDDDDD".
The result will be an output list of these letters, e.g.:
ABBBBBBBBB
CCCDDDDDDD
How can I print the 10 letters preceding every occurrence of the letter "Z" in my document?
Answer 1:
Simple:
grep -oP '.{10}(?=Z)' <<< AAAABBBBBBBBBZCCCCCDDDDDDDZ
Explanation:
-o : Print only the matched part, not the entire line
-P : Use PCRE / Perl-compatible regex
.{10} : Match any 10 characters,
(?=Z) : which are followed by "Z". (Search for "positive look-ahead" for more details.)
<<< ... : Here-string (passes the string to grep on standard input)
EDIT:
NOTE: This does not work if the 10-character runs we want overlap, e.g. input=AAAABBBBBBBBBZDDDDDDDZ. grep -o resumes searching after the end of the previous match, so the second run is missed. If the input contains such a pattern, see ikegami's answer.
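To see the overlap limitation concretely, here is a minimal sketch. It assumes GNU grep built with PCRE support (the -P option); the variable names are only for illustration. The second "Z" goes unreported because two of its 10 preceding characters were already consumed by the first match:

```shell
# Overlapping case: the 10 characters before the second "Z" include
# the tail of the first match ("BB") and the first "Z" itself.
overlap_input='AAAABBBBBBBBBZDDDDDDDZ'

# Assumes GNU grep with -P (PCRE) available.
grep_result=$(printf '%s\n' "$overlap_input" | grep -oP '.{10}(?=Z)')

# Only one line comes back, although the input contains two "Z"s
# that each have at least 10 characters in front of them.
printf '%s\n' "$grep_result"
```

The expected second result, "BBZDDDDDDD", never appears, because grep -o starts looking for the next match only after the end of the previous one.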
Answer 2:
$ perl -nE'say for /(?<=(.{10}))Z/g' <<'__EOI__'
AAAABBBBBBBBBZCCCCCDDDDDDDZ
AAAABBBBBBBBBZDDDDDDDZ
__EOI__
ABBBBBBBBB
CCCDDDDDDD
ABBBBBBBBB
BBZDDDDDDD
or
$ perl -nE'say for /(?=(.{10})Z)/g' <<'__EOI__'
AAAABBBBBBBBBZCCCCCDDDDDDDZ
AAAABBBBBBBBBZDDDDDDDZ
__EOI__
ABBBBBBBBB
CCCDDDDDDD
ABBBBBBBBB
BBZDDDDDDD
Source: https://stackoverflow.com/questions/17773501/how-to-print-10-letters-preceding-every-occurrence-of-a-particular-character