Question
I want the words after the word "test" from each line in a file. In other words, I don't want the words that come before "test".
That's the pattern...
E.g.:
Input:
***This is a*** test page.
***My*** test work of test is complete.
Output:
test page.
work of test is complete.
Answer 1:
Using sed:
sed -n 's/^.*test/test/p' input
If you want to print non-matching lines, untouched:
sed 's/^.*test/test/' input
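The one above greedily removes all text up to the last test on a line. If you want to delete only up to the first test, use potong's suggestion:
sed -n 's/test/&\n/;s/.*\n//p' input
Note that this form also removes the first test itself, and it needs a sed that accepts \n in the replacement, such as GNU sed. To keep test in the output, put the newline before the match: s/test/\n&/.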
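To see the difference between the greedy and the first-match approach, here is a small sketch run on the second sample line. It assumes GNU sed (for \n in the replacement) and uses the variant that puts the newline before the match, s/test/\n&/, so that test itself stays in the output. The variable names are only for illustration.

```shell
#!/usr/bin/env bash
# Sample line from the question; it contains "test" twice.
line='My test work of test is complete.'

# Greedy: .* runs up to the LAST "test", so only the tail survives.
greedy=$(printf '%s\n' "$line" | sed -n 's/^.*test/test/p')

# First match: insert a newline before the first "test", then delete
# everything up to and including that newline (GNU sed assumed for \n).
first=$(printf '%s\n' "$line" | sed -n 's/test/\n&/;s/.*\n//p')

printf 'greedy: %s\n' "$greedy"   # greedy: test is complete.
printf 'first:  %s\n' "$first"    # first:  test work of test is complete.
```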
Answer 2:
A pure bash one-liner:
while IFS= read -r x; do [[ $x =~ test.* ]] && echo "${BASH_REMATCH[0]}"; done <infile
Input: infile
This is a test page.
My test work of test is complete.
Output:
test page.
test work of test is complete.
It reads all lines from the file infile, checks whether each line contains the string test, and then prints the rest of the line (including test).
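The same loop can be wrapped in a function so it works on any input stream. This is only a sketch: from_test is a hypothetical name, and the third input line is added here to show that non-matching lines are skipped.

```shell
#!/usr/bin/env bash
# from_test is a hypothetical helper name, not part of the original answer.
# It reads lines on stdin and prints each matching line from the first
# "test" onward; lines without "test" are skipped.
from_test() {
    local line
    while IFS= read -r line; do
        # The regex engine reports the leftmost match, so BASH_REMATCH[0]
        # starts at the first "test" and .* extends it to the end of the line.
        if [[ $line =~ test.* ]]; then
            printf '%s\n' "${BASH_REMATCH[0]}"
        fi
    done
}

result=$(printf '%s\n' \
    'This is a test page.' \
    'My test work of test is complete.' \
    'no match here' | from_test)
printf '%s\n' "$result"
```

Using IFS= and read -r keeps leading whitespace and backslashes in the input intact, and quoting the expansion avoids word splitting and globbing on the output.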
The same in sed:
sed 's/.*\(test.*\)/\1/' infile
(Oops! This is wrong! .* is greedy, so it cuts too much from the 2nd example line.) This works well:
sed -e 's/\(test.*\)/\x03&/' -e 's/.*\x03//' infile
I did some speed testing (using the original, wrong sed version). For small files the bash solution is faster; for larger files sed is faster. I also tried this awk version, which is faster still on big files:
awk 'match($0,"test.*"){print substr($0,RSTART)}' infile
Similar in perl:
perl -ne 's/(.*?)(test.*)/$2/ and print' infile
I started with the two-line example input file and doubled it for each measurement. Every version was run 1000 times. The results:
  Size |    bash |     sed |     awk |    perl
   [B] |   [sec] |   [sec] |   [sec] |   [sec]
----------------------------------------------
    55 |   0.420 |  10.510 |  10.900 |  17.911
   110 |   0.460 |  10.491 |  10.761 |  17.901
   220 |   0.800 |  10.451 |  10.730 |  17.901
   440 |   1.780 |  10.511 |  10.741 |  17.871
   880 |   4.030 |  10.671 |  10.771 |  17.951
  1760 |   8.600 |  10.901 |  10.840 |  18.011
  3520 |  17.691 |  11.460 |  10.991 |  18.181
  7040 |  36.042 |  12.401 |  11.300 |  18.491
 14080 |  72.355 |  14.461 |  11.861 |  19.161
 28160 | 145.950 |  18.621 |  12.981 |  20.451
 56320 |         |         |  15.132 |  23.022
112640 |         |         |  19.763 |  28.402
225280 |         |         |  29.113 |  39.203
450560 |         |         |  47.634 |  60.652
901120 |         |         |  85.047 | 103.997
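The harness behind these numbers is not shown. A measurement along the same lines could be scripted roughly as follows. This is a sketch under assumptions: the run count (RUNS, default 20 instead of 1000), the temporary file, the three doubling rounds, and the helper names bash_loop and bench are all made up for illustration, and perl is left out to keep it short.

```shell
#!/usr/bin/env bash
# Timing sketch. The run count, the temp file and the number of doublings are
# assumptions for illustration; the measurements above used 1000 runs.
runs=${RUNS:-20}
infile=$(mktemp)
trap 'rm -f "$infile" "$infile.tmp"' EXIT
printf '%s\n' 'This is a test page.' 'My test work of test is complete.' > "$infile"

TIMEFORMAT=%R   # have the bash `time` keyword report elapsed seconds only

# The pure-bash variant, wrapped in a function so it can be timed like a command.
bash_loop() {
    local x
    while IFS= read -r x; do
        [[ $x =~ test.* ]] && echo "${BASH_REMATCH[0]}"
    done
}

# Run the given command $runs times on $infile and print the total elapsed time.
bench() {
    local i
    { time for ((i = 0; i < runs; i++)); do "$@" < "$infile" > /dev/null; done; } 2>&1
}

sizes=()
for round in 1 2 3; do
    size=$(( $(wc -c < "$infile") ))
    sizes+=("$size")
    # Columns: size in bytes | bash | sed | awk (seconds for $runs runs each).
    printf '%7d | %s | %s | %s\n' "$size" \
        "$(bench bash_loop)" \
        "$(bench sed -n 's/^.*test/test/p')" \
        "$(bench awk 'match($0,"test.*"){print substr($0,RSTART)}')"
    # Double the file for the next round.
    cat "$infile" "$infile" > "$infile.tmp" && mv "$infile.tmp" "$infile"
done
```

For sed and awk, most of the time on small inputs goes to starting a new process for every run, which is consistent with their nearly flat timings at the top of the table. The bash loop runs inside the current shell, so its cost grows with the input from the start.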
Source: https://stackoverflow.com/questions/16892013/how-to-remove-words-of-a-line-upto-specific-character-pattern-regex