Use awk sed command and while loop to remove entries from second file

Posted by 丶灬走出姿态 on 2019-12-10 18:26:34

Question


I have two output files:

  1. FILE-A contains 70,000+ unique entries.
  2. FILE-B contains a unique list of entries that I need to remove from FILE-A.

FILE-A:

 TOM
 JACK
 AILEY
 BORG
 ROSE
 ELI

FILE-B Content:

 TOM
 ELI

I want to remove anything listed in FILE-B from FILE-A.

FILE-C (Result file):

 JACK
 AILEY
 BORG
 ROSE

I assume I need a while or for loop. Can someone help me with this? I need to read FILE-A and, for every line in FILE-B, remove that line from FILE-A.

What command should I use?


Answer 1:


You don't need awk, sed, or a loop. You just need grep:

fgrep -vxf FILE-B FILE-A

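Note the use of -x to match whole lines exactly; -v inverts the match and -f reads the patterns from FILE-B. fgrep is equivalent to grep -F, which treats each pattern as a fixed string rather than a regex.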

Output:

JACK
AILEY
BORG
ROSE
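
To illustrate why -x matters, here is a minimal sketch using hypothetical sample data (the TOMMY entry and the temporary directory are assumptions added for illustration, not part of the original files). It uses the grep -F spelling of fgrep:

```shell
# Hypothetical sample data in a scratch directory (not from the original post).
tmp=$(mktemp -d)
printf '%s\n' TOM TOMMY JACK ELI > "$tmp/FILE-A"
printf '%s\n' TOM ELI > "$tmp/FILE-B"

# With -x only whole-line matches are removed, so TOMMY survives.
with_x=$(grep -F -v -x -f "$tmp/FILE-B" "$tmp/FILE-A")

# Without -x, TOM also matches as a substring, so TOMMY is removed too.
without_x=$(grep -F -v -f "$tmp/FILE-B" "$tmp/FILE-A")

printf 'with -x:\n%s\n' "$with_x"
printf 'without -x:\n%s\n' "$without_x"
rm -rf "$tmp"
```

With 70,000+ entries, partial matches like this are easy to miss, so keeping -x is the safer default.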



Answer 2:


You can use grep -v -f:

grep -xFvf FILE-B FILE-A

Output:

JACK
AILEY
BORG
ROSE



Answer 3:


If you start with sorted input, the tool for this task is comm:

comm -23 FILE-A FILE-B

The options mean:

-2              suppress lines unique to FILE-B
-3              suppress lines that appear in both files

If the input is not sorted, you can sort it on the fly using process substitution:

comm -23 <(sort FILE-A) <(sort FILE-B)
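
For shells without process substitution, a rough equivalent is to sort into temporary files first. The sketch below assumes the sample data from the question; the scratch directory and the A.sorted / B.sorted file names are hypothetical. Note that the result comes out in sorted order, not in the original order of FILE-A:

```shell
# Sample data from the question, written to a scratch directory (assumed location).
tmp=$(mktemp -d)
printf '%s\n' TOM JACK AILEY BORG ROSE ELI > "$tmp/FILE-A"
printf '%s\n' TOM ELI > "$tmp/FILE-B"

# comm expects both inputs sorted in the same collation order; LC_ALL=C pins it.
LC_ALL=C sort "$tmp/FILE-A" > "$tmp/A.sorted"
LC_ALL=C sort "$tmp/FILE-B" > "$tmp/B.sorted"

# -23 keeps only the lines unique to the first file.
result=$(LC_ALL=C comm -23 "$tmp/A.sorted" "$tmp/B.sorted")
printf '%s\n' "$result"
rm -rf "$tmp"
```

If the original line order of FILE-A must be preserved, the grep or awk approaches are a better fit.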



Answer 4:


You don't need a loop; a single awk or sed command is enough:

awk:

awk 'FNR==NR {a[$0];next} !($0 in a)' FILE-B FILE-A >FILE-C

sed:

sed "s=^=/^=;s=$=$/d=" FILE-B | sed -f- FILE-A >FILE-C

Note:

  1. While the sed version works for the data shown, it won't handle any text in FILE-B which can be interpreted as a regex pattern.
  2. The awk solution reads FILE-B entirely into memory. It doesn't have the limitation of interpreting text as regex like the sed solution.
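
The difference described in the notes can be seen with a small sketch. The entries ABC and A.C are hypothetical test data chosen so that the dot acts as a regex wildcard, and the generated script is written to a temporary file (script.sed, an assumed name) instead of being piped to sed -f-, which may not be accepted by every sed implementation:

```shell
# Hypothetical data: FILE-B holds an entry containing a regex metacharacter.
tmp=$(mktemp -d)
printf '%s\n' ABC A.C JACK > "$tmp/FILE-A"
printf '%s\n' A.C > "$tmp/FILE-B"

# awk compares whole lines as literal strings, so only A.C is removed.
awk_out=$(awk 'FNR==NR {a[$0]; next} !($0 in a)' "$tmp/FILE-B" "$tmp/FILE-A")

# sed turns each FILE-B line into /^...$/d; the dot then matches any character,
# so ABC is deleted along with A.C.
sed 's=^=/^=;s=$=$/d=' "$tmp/FILE-B" > "$tmp/script.sed"
sed_out=$(sed -f "$tmp/script.sed" "$tmp/FILE-A")

printf 'awk:\n%s\n' "$awk_out"
printf 'sed:\n%s\n' "$sed_out"
rm -rf "$tmp"
```

For the plain names shown in the question both commands produce the same result; the divergence only appears once FILE-B contains characters such as . * [ or \.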


Source: https://stackoverflow.com/questions/31483597/use-awk-sed-command-and-while-loop-to-remove-entries-from-second-file
