Question
I want the words that come after the word "test" on each line of a file. In other words, I don't want the words that come before "test".
That's the pattern...
E.g.:
Input:
***This is a*** test page.
***My*** test work of test is complete.
Output:
test page.
work of test is complete.
Answer 1:
Using sed:
sed -n 's/^.*test/test/p' input
If you want to print non-matching lines, untouched:
sed 's/^.*test/test/' input
The commands above greedily remove all text up to the last test on a line. If you want to delete only up to the first test, use potong's suggestion:
sed -n 's/test/&\n/;s/.*\n//p' input
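To make the difference between "last test" and "first test" concrete, here is a minimal sketch in plain shell that uses parameter expansion instead of sed. The function names from_last_test and from_first_test are hypothetical, and the sketch assumes the pattern is the literal string test. Note that the sed command just above puts the newline after the match, so it drops the first test as well; the sketch keeps it, like the other commands in this answer.

```shell
# Hypothetical helpers for illustration only, not part of the original answer.

# Keep the text from the LAST "test" onward (like the greedy sed 's/^.*test/test/').
# ${1##*test} strips the longest prefix ending in "test".
from_last_test() {
    case $1 in
        *test*) printf 'test%s\n' "${1##*test}" ;;
    esac
}

# Keep the text from the FIRST "test" onward.
# ${1#*test} strips the shortest prefix ending in "test".
from_first_test() {
    case $1 in
        *test*) printf 'test%s\n' "${1#*test}" ;;
    esac
}
```

Called on the second sample line, from_last_test keeps only the part starting at the second test, while from_first_test keeps everything from the first test. Lines without test print nothing, like sed -n with the p flag.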
Answer 2:
A pure bash one-liner:
while IFS= read -r x; do [[ $x =~ test.* ]] && printf '%s\n' "${BASH_REMATCH[0]}"; done <infile
Input: infile
This is a test page.
My test work of test is complete.
Output:
test page.
test work of test is complete.
It reads all lines from file infile, checks if the line contains the string test and then prints the rest of the line (including test).
The same in sed:
sed 's/.*\(test.*\)/\1/' infile
(Oops! This is wrong! .* is greedy, so it cuts too much from the 2nd example line.) This works well:
sed -e 's/\(test.*\)/\x03&/' -e 's/.*\x03//' infile
I did some speed testing (using the original, wrong sed version). For small files the bash solution is faster; for larger files sed is faster. I also tried this awk version, which is faster still on big files:
awk 'match($0,"test.*"){print substr($0,RSTART)}' infile
Similar in perl:
perl -ne 's/(.*?)(test.*)/$2/ and print' infile
I started from the two-line example input file and doubled its size for each row. Each version was run 1000 times. The results:
Size [B] | bash [s] | sed [s] | awk [s] | perl [s]
---------+----------+---------+---------+---------
      55 |    0.420 |  10.510 |  10.900 |   17.911
     110 |    0.460 |  10.491 |  10.761 |   17.901
     220 |    0.800 |  10.451 |  10.730 |   17.901
     440 |    1.780 |  10.511 |  10.741 |   17.871
     880 |    4.030 |  10.671 |  10.771 |   17.951
    1760 |    8.600 |  10.901 |  10.840 |   18.011
    3520 |   17.691 |  11.460 |  10.991 |   18.181
    7040 |   36.042 |  12.401 |  11.300 |   18.491
   14080 |   72.355 |  14.461 |  11.861 |   19.161
   28160 |  145.950 |  18.621 |  12.981 |   20.451
   56320 |          |         |  15.132 |   23.022
  112640 |          |         |  19.763 |   28.402
  225280 |          |         |  29.113 |   39.203
  450560 |          |         |  47.634 |   60.652
  901120 |          |         |  85.047 |  103.997
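
The script used for these timings is not shown. A harness along the following lines would reproduce the setup described above (the two-line sample doubled repeatedly, each command run a fixed number of times). This is only a sketch: make_input and repeat are hypothetical names, and the actual measurements may have been taken differently.

```shell
# Hypothetical harness: the answer does not show the script that was actually used.

# Write the two-line sample (55 bytes) to file $1, then double the file $2 times.
make_input() {
    local i=0
    printf '%s\n' 'This is a test page.' 'My test work of test is complete.' > "$1"
    while [ "$i" -lt "$2" ]; do
        cat "$1" "$1" > "$1.tmp" && mv "$1.tmp" "$1"
        i=$((i + 1))
    done
}

# Run a command $1 times, discarding its standard output.
repeat() {
    local n="$1"
    shift
    while [ "$n" -gt 0 ]; do
        "$@" > /dev/null
        n=$((n - 1))
    done
}
```

For example, make_input infile 3 produces the 440-byte input, and in bash, time repeat 1000 awk 'match($0,"test.*"){print substr($0,RSTART)}' infile times one candidate on it.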
Source: https://stackoverflow.com/questions/16892013/how-to-remove-words-of-a-line-upto-specific-character-pattern-regex