Question
I have two output files:
- FILE-A contains 70,000+ unique entries.
- FILE-B contains a unique listing of entries that I need to remove from FILE-A.
FILE-A:
TOM
JACK
AILEY
BORG
ROSE
ELI
FILE-B Content:
TOM
ELI
I want to remove anything listed in FILE-B from FILE-A.
FILE-C (Result file):
JACK
AILEY
BORG
ROSE
I assume I need a while or for statement. Can someone help me with this?
I need to cat and read FILE-A, and for every line in FILE-B I need to remove that line from FILE-A.
What command should I use?
Answer 1:
You don't need awk, sed, or a loop. You just need grep:
fgrep -vxf FILE-B FILE-A
Please note the use of -x to match entries exactly; a short sketch of why this matters follows the output below.
Output:
JACK
AILEY
BORG
ROSE
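A quick sketch of why -x matters, assuming FILE-A had also contained a longer name such as TOMMY (not part of the original data):
# FILE-B holds TOM and ELI. Without -x the fixed string TOM matches
# anywhere in a line, so TOMMY would be filtered out as well:
printf 'TOM\nTOMMY\n' | grep -Fvf FILE-B      # prints nothing
# With -x only whole-line matches are removed, so TOMMY survives:
printf 'TOM\nTOMMY\n' | grep -Fvxf FILE-B     # prints TOMMY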
Answer 2:
You can use grep -v -f, here combined with -x (whole-line matches) and -F (fixed strings rather than regexes):
grep -xFvf FILE-B FILE-A
JACK
AILEY
BORG
ROSE
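If you are using GNU grep, the same command can be spelled with long options, which makes each flag self-explanatory (purely a readability variant of the command above):
# --line-regexp (-x): match whole lines only
# --fixed-strings (-F): treat patterns as literal strings, not regexes
# --invert-match (-v): keep lines that do NOT match
# --file=FILE-B (-f): read the patterns from FILE-B
grep --line-regexp --fixed-strings --invert-match --file=FILE-B FILE-A > FILE-C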
Answer 3:
If you start with sorted input, the tool for this task is comm:
comm -23 FILE-A FILE-B
The options mean:
- -2 suppresses lines unique to FILE-B
- -3 suppresses lines that appear in both files
If the files are not sorted initially, you can do the following:
comm -23 <(sort FILE-A) <(sort FILE-B)
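For reference, comm with no flags prints three columns and -2/-3 simply hide two of them; a sketch assuming the files from the question and a shell (such as bash) that supports process substitution:
# Column 1: lines only in FILE-A, column 2: lines only in FILE-B,
# column 3: lines common to both (comm requires sorted input).
comm <(sort FILE-A) <(sort FILE-B)
# Suppressing columns 2 and 3 leaves exactly the lines of FILE-A
# that are not in FILE-B; redirect to produce the result file:
comm -23 <(sort FILE-A) <(sort FILE-B) > FILE-C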
Answer 4:
You don't need any loop; a single awk or sed command is enough (an annotated version of the awk command follows the notes below):
awk:
awk 'FNR==NR {a[$0];next} !($0 in a)' FILE-B FILE-A >FILE-C
sed:
sed "s=^=/^=;s=$=$/d=" FILE-B | sed -f- FILE-A >FILE-C
Note:
- While the sed version works for the data shown, it won't handle text in FILE-B that can be interpreted as a regex pattern.
- The awk solution reads FILE-B entirely into memory. It does not have the sed solution's limitation of interpreting text as regex.
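For readers less familiar with the awk idiom, here is the same one-liner spread over several lines with comments; the logic is unchanged:
awk '
    FNR == NR {      # true only while reading the first file, FILE-B
        a[$0]        # remember each FILE-B line as a key of array a
        next         # skip the rest of the script for FILE-B lines
    }
    !($0 in a)       # FILE-A lines: print only those not seen in FILE-B
' FILE-B FILE-A > FILE-C
The sed variant works by turning each FILE-B line X into a delete command of the form /^X$/d and feeding that generated script to a second sed via -f -.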
Source: https://stackoverflow.com/questions/31483597/use-awk-sed-command-and-while-loop-to-remove-entries-from-second-file