How to delete all the lines after the last occurrence of a pattern?

Submitted by 给你一囗甜甜゛ on 2021-02-04 07:25:08

Question


I want to delete all the lines after the last occurrence of a pattern, keeping the matching line itself.

file.txt

honor
apple
redmi
nokia
apple
samsung
lg
htc

file.txt (what I want)

honor
apple
redmi
nokia
apple

What I have tried:

sed -i '/apple/q' file.txt

This deletes all the lines after the first occurrence of the pattern, leaving:

honor
apple

Answer 1:


Simple, robust 2-pass approach using almost no memory:

$ awk 'NR==FNR{if (/apple/) hit=NR; next} {print} FNR==hit{exit}' file file
honor
apple
redmi
nokia
apple

If that doesn't execute fast enough THEN it's time to try some alternatives to see if any produce a performance improvement.
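
The two-pass command above can be wrapped in a small shell function so the pattern and file name are not hard-coded. This is only a sketch: the function name trim_after_last is made up for illustration, and the pattern is assumed to be an awk extended regular expression passed in with -v (awk processes backslash escapes in -v values, so patterns containing backslashes may need extra escaping).

```shell
# Hypothetical helper: print FILE up to and including the last line matching PATTERN.
# Usage: trim_after_last PATTERN FILE
trim_after_last() {
    # The file is named twice because awk reads it in two passes,
    # so it must be a regular file rather than a pipe.
    awk -v pat="$1" '
        NR == FNR { if ($0 ~ pat) hit = NR; next }  # pass 1: remember the last matching line number
        { print }                                   # pass 2: print each line ...
        FNR == hit { exit }                         # ... and stop right after the last match
    ' "$2" "$2"
}
```

For example, trim_after_last apple file.txt > newfile.txt. As in the original command, a file with no match is printed unchanged, because hit is never set and the exit condition never fires.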




Answer 2:


Reverse the file, print everything starting from the first occurrence of the pattern, then reverse the result:

tac file.txt | sed -n '/apple/,$p' | tac > newfile.txt

Alternatively, find the line number of the last match, then use it to print the first N lines of the file:

line=$(awk '/apple/ { line=NR } END {print line}' file.txt)
head -n "$line" file.txt > newfile.txt



Answer 3:


If you don't want to reverse the file as Barmar suggests, you will either have to read the file backwards using lower-level tools (e.g., fseek) or read it twice:

sed $(awk '/apple/{a=NR}END{print a+1}' input),\$d input

(Note that if the pattern does not appear in the file, this will output nothing. That's an edge case you should worry about.)
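
One way to handle that edge case is to check whether awk found a match before calling sed. The sketch below is an illustration rather than part of the answer: the function name delete_after_last is hypothetical, and it assumes that a file without any match should be passed through unchanged.

```shell
# Hypothetical wrapper around the awk + sed combination above.
# Assumption: when PATTERN never matches, FILE is printed unchanged.
# Usage: delete_after_last PATTERN FILE
delete_after_last() {
    pattern=$1
    file=$2
    # Line number of the last match; prints nothing when there is no match.
    last=$(awk -v pat="$pattern" '$0 ~ pat { n = NR } END { if (n) print n }' "$file")
    if [ -z "$last" ]; then
        cat "$file"                        # no match: keep everything
    else
        sed "$((last + 1)),\$d" "$file"    # delete from the line after the last match to the end
    fi
}
```

For example, delete_after_last apple input prints the lines up to the last apple, while delete_after_last pear input prints the whole file.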




Answer 4:


This might work for you (GNU sed):

sed '/apple/,$!b;//!H;//{x;//p;x;h};${x;P};d' file

Lines before the first appearance of apple are printed as usual. From that line to the end of the file, lines that do not contain apple are appended to the hold space (HS). When a line containing apple arrives, the HS is swapped in and printed if it contains apple (that is, if it holds an earlier match plus the lines that followed it), and the HS is then replaced by the current line. On the last line, the HS is swapped in and only its first line, the last match, is printed. Every line in the range is deleted from the pattern space instead of being printed automatically.

If slurping a large file is not a problem use:

sed -rz 's/(.*apple[^\n]*).*/\1\n/' file

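This uses greedy matching to capture every line up to and including the last line that contains apple. With -z the whole file is read as a single record (assuming it contains no NUL bytes), so . also matches newlines.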




Answer 5:


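Here is another awk approach that does not scan the file twice. It prints each matching line as soon as it is seen and buffers the non-matching lines that follow until the next match arrives, so it also works when the pattern occurs once or more than twice:

$ awk '/apple/ {printf "%s", buf; print; buf=""; f=1; next}  # match: flush buffer, print line
       f       {buf=buf $0 ORS; next}                        # after a match: buffer the line
       1' file                                               # before the first match: print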

honor
apple
redmi
nokia
apple



Answer 6:


If you don't mind having everything in memory, you can do:

$ awk '/^apple$/ {last=NR}
                 {lines[NR]=$0}
       END       {for(li=1;li<=last;li++) print lines[li]}' file
honor
apple
redmi
nokia
apple



Answer 7:


Given that you are dealing with large input I would go with a two-pass coreutils solution, e.g.:

n=$(grep -Fn apple infile | tail -n1 | cut -d: -f1)
[ -n "$n" ] && head -n "$n" infile > outfile

This uses grep's fixed-string matching (-F) with line numbers (-n) to list every line containing apple; tail -n1 keeps the last match and cut extracts its line number. head then prints the first n lines of the file.

You did not specify what happens when no apples are found, so this solution does nothing when that occurs.
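
The question edits the file in place with sed -i, whereas head can only write to a separate output. A possible in-place variant is sketched below. The function name truncate_after_last is hypothetical, and the sketch assumes a temporary file from mktemp is acceptable; the result is copied back over the original so its permissions are kept.

```shell
# Hypothetical helper: cut FILE off in place after the last line containing STRING.
# Usage: truncate_after_last STRING FILE
truncate_after_last() {
    str=$1
    file=$2
    # Line number of the last line containing the fixed string (empty if none).
    n=$(grep -Fn -e "$str" "$file" | tail -n 1 | cut -d: -f1)
    [ -n "$n" ] || return 0                 # no match: leave the file untouched
    scratch=$(mktemp) || return 1
    # Write the kept lines to a scratch file first, then copy them back.
    head -n "$n" "$file" > "$scratch" && cat "$scratch" > "$file"
    status=$?
    rm -f "$scratch"
    return "$status"
}
```

For example, truncate_after_last apple file.txt rewrites file.txt so that it ends at the last apple line.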



Source: https://stackoverflow.com/questions/44308594/how-to-delete-all-the-lines-after-the-last-occurence-of-pattern
