Question
I have two output files:
- FILE-A contains 70,000+ unique entries.
- FILE-B contains a unique listing of entries that I need to remove from FILE-A.
FILE-A:
TOM
JACK
AILEY
BORG
ROSE
ELI
FILE-B Content:
TOM
ELI
I want to remove anything listed in FILE-B from FILE-A.
FILE-C (Result file):
JACK
AILEY
BORG
ROSE
I assume I need a while or for statement. Can someone help me with this?
I need to cat and read FILE-A, and for every line in FILE-B I need to remove that line from FILE-A.
What command should I use?
Answer 1:
You don't need awk, sed, or a loop. You just need grep:
fgrep -vxf FILE-B FILE-A
Please note the use of -x to match entries exactly; a short sketch of why this matters follows the output below.
Output:
JACK
AILEY
BORG
ROSE
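A quick sketch of why -x matters, assuming FILE-A had also contained a longer name such as TOMMY (not part of the original data):
# FILE-B holds TOM and ELI. Without -x the fixed string TOM matches
# anywhere in a line, so TOMMY would be filtered out as well:
printf 'TOM\nTOMMY\n' | grep -Fvf FILE-B      # prints nothing
# With -x only whole-line matches are removed, so TOMMY survives:
printf 'TOM\nTOMMY\n' | grep -Fvxf FILE-B     # prints TOMMY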
Answer 2:
You can use grep -v -f, here combined with -x (whole-line matches) and -F (fixed strings rather than regexes):
grep -xFvf FILE-B FILE-A
JACK
AILEY
BORG
ROSE
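If you are using GNU grep, the same command can be spelled with long options, which makes each flag self-explanatory (purely a readability variant of the command above):
# --line-regexp (-x): match whole lines only
# --fixed-strings (-F): treat patterns as literal strings, not regexes
# --invert-match (-v): keep lines that do NOT match
# --file=FILE-B (-f): read the patterns from FILE-B
grep --line-regexp --fixed-strings --invert-match --file=FILE-B FILE-A > FILE-C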
Answer 3:
If you start with sorted input, the tool for this task is comm:
comm -23 FILE-A FILE-B
The options mean:
- -2 suppresses lines unique to FILE-B
- -3 suppresses lines that appear in both files
If the files are not sorted initially, you can do the following:
comm -23 <(sort FILE-A) <(sort FILE-B)
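For reference, comm with no flags prints three columns and -2/-3 simply hide two of them; a sketch assuming the files from the question and a shell (such as bash) that supports process substitution:
# Column 1: lines only in FILE-A, column 2: lines only in FILE-B,
# column 3: lines common to both (comm requires sorted input).
comm <(sort FILE-A) <(sort FILE-B)
# Suppressing columns 2 and 3 leaves exactly the lines of FILE-A
# that are not in FILE-B; redirect to produce the result file:
comm -23 <(sort FILE-A) <(sort FILE-B) > FILE-C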
Answer 4:
You don't need any loop; a single awk or sed command is enough (an annotated version of the awk command follows the notes below):
awk:
awk 'FNR==NR {a[$0];next} !($0 in a)' FILE-B FILE-A >FILE-C
sed:
sed "s=^=/^=;s=$=$/d=" FILE-B | sed -f- FILE-A >FILE-C
Note:
- While the sed version works for the data shown, it won't handle text in FILE-B that can be interpreted as a regex pattern.
- The awk solution reads FILE-B entirely into memory. It does not have the sed solution's limitation of interpreting text as regex.
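For readers less familiar with the awk idiom, here is the same one-liner spread over several lines with comments; the logic is unchanged:
awk '
    FNR == NR {      # true only while reading the first file, FILE-B
        a[$0]        # remember each FILE-B line as a key of array a
        next         # skip the rest of the script for FILE-B lines
    }
    !($0 in a)       # FILE-A lines: print only those not seen in FILE-B
' FILE-B FILE-A > FILE-C
The sed variant works by turning each FILE-B line X into a delete command of the form /^X$/d and feeding that generated script to a second sed via -f -.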
Source: https://stackoverflow.com/questions/31483597/use-awk-sed-command-and-while-loop-to-remove-entries-from-second-file