How to check if a list of strings are present in two separate files

I have two files. "File A" is a list of IP addresses, each with its corresponding MAC address on the same line. "File B" is a list of MAC addresses only. I need to compare the two files and list the lines from File A whose MAC address is not found in File B.

FILE A:

172.0.0.1 AA:BB:CC:DD:EE:01
172.0.0.2 AA:BB:CC:DD:EE:02
172.0.0.3 AA:BB:CC:DD:EE:03

FILE B:

AA:BB:CC:DD:EE:01
AA:BB:CC:DD:EE:02

So the output should be:

172.0.0.3 AA:BB:CC:DD:EE:03

I am looking for solutions in sed, awk, grep, Python, or really anything that gives me the file I want.

Does your input really have a dollar sign at the start of every line, or is that a formatting quirk of your question? If you can get rid of the dollar signs, then you can use this:

fgrep -v -f fileb filea

with open('filea','r') as fa:    
    with open('fileb','r') as f:
        MACS=set(line.strip() for line in f)

    for line in fa:
        IP,MAC=line.split()
        if MAC not in MACS:
            print (line.strip())

#!/usr/bin/env python
with open('fileb') as fileb, open('filea') as filea:
    macs = set(map(str.strip, fileb))
    for line in filea:
        ip_mac = line.split()
        if len(ip_mac) == 2 and ip_mac[1] not in macs:
            print(" ".join(ip_mac))

Python:

macs = set(line.strip() for line in open('fileb'))
with open('filea') as ips:
    for line in ips:
        ip,mac = line.split()
        if mac not in macs:
            print(line.rstrip())

EDIT: OK so everyone posted the same python answer. I reach for python first too but gawk at this:

awk 'NR == FNR {fileb[$1];next} !($2 in fileb)' fileb filea

EDIT2: OP removed the leading $ from the lines so python and awk change and fgrep comes out to play.

fgrep -v -f fileb filea

# FILEB and FILEA hold the paths to the two files
with open(FILEB) as file1,open(FILEA) as file2:
    file1={mac.strip() for mac in file1}
    file2={line.split()[1]:line.split()[0] for line in file2}
    for x in file2:
        if x not in file1:
            print("{0} {1}".format(file2[x],x))

output:

172.0.0.2 AA:BB:CC:DD:EE:05
172.0.0.4 AA:BB:CC:DD:EE:06
172.0.0.6 AA:BB:CC:DD:EE:03
172.0.0.66 AA:BB:CC:DD:EE:0E

One way using awk. It saves the MACs from fileB in an array, then checks the second field of each line of fileA against that array and prints the line only when the MAC is not found.

awk '
    FNR == NR {
        data[ $0 ] = 1;
        next;
    }
    FNR < NR && !($2 in data)
' fileB fileA

Output:

172.0.0.3 AA:BB:CC:DD:EE:03

Python is easiest. Read File B into a dictionary, then go through File A and look for a match in the dictionary.
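
A minimal sketch of that approach, assuming each line of File A is exactly "IP MAC" separated by whitespace. The function name `unmatched_lines` is made up for illustration, and a set is used in place of a dictionary because only membership matters here:

```python
def unmatched_lines(file_a_lines, file_b_lines):
    """Return the lines of File A whose MAC (second field) is not listed in File B."""
    # A set gives the same fast membership test as a dictionary,
    # without needing dummy values.
    known_macs = {line.strip() for line in file_b_lines if line.strip()}
    result = []
    for line in file_a_lines:
        fields = line.split()
        # Skip blank or malformed lines instead of failing on them.
        if len(fields) != 2:
            continue
        ip, mac = fields
        if mac not in known_macs:
            result.append(line.rstrip("\n"))
    return result


if __name__ == "__main__":
    # Sample data taken from the question.
    file_a = [
        "172.0.0.1 AA:BB:CC:DD:EE:01\n",
        "172.0.0.2 AA:BB:CC:DD:EE:02\n",
        "172.0.0.3 AA:BB:CC:DD:EE:03\n",
    ]
    file_b = ["AA:BB:CC:DD:EE:01\n", "AA:BB:CC:DD:EE:02\n"]
    for line in unmatched_lines(file_a, file_b):
        print(line)
```

With real files, the two arguments can simply be open file objects, since iterating over a file yields its lines.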

I could whip up a Java example that you could translate to whatever language you want:

import java.io.*;
import java.util.*;
class Macs {
    public static void main(String...args)throws Exception {
        Set<String> macs = loadLines("macs.txt");
        Set<String> ips = loadLines("ips.txt");

        for(String raw : ips) {
            String[] tokens = raw.split("\\s"); // by space
            String ip = tokens[0];
            String mac = tokens[1];
            if(!macs.contains(mac))
                System.out.println(raw);
        } 
    }

    static Set<String> loadLines(String filename) throws Exception {
        Scanner sc = new Scanner(new File(filename));
        Set<String> lines = new HashSet<String>();
        while(sc.hasNextLine()) {
            // substring(1) removes leading $
            lines.add(sc.nextLine().substring(1).toLowerCase());
        }
        return lines;
    }
}

Redirecting this output to a file will give you your result.

With the following input files:

macs.txt

$AA:BB:CC:DD:EE:01
$AA:BB:CC:DD:EE:02
$AA:BB:CF:DD:EE:09
$AA:EE:CF:DD:EE:09

ips.txt

$172.0.0.1 AA:BB:CC:DD:EE:01
$172.0.0.2 AA:BB:CC:DD:EE:02
$172.0.0.2 AA:BB:CC:DD:EE:05
$172.0.0.66 AA:BB:CC:DD:EE:0E
$172.0.0.4 AA:BB:CC:DD:EE:06
$172.0.0.5 AA:BB:CF:DD:EE:09
$172.0.0.6 AA:BB:CC:DD:EE:03

Result:

c:\files\j>java Macs
172.0.0.6 aa:bb:cc:dd:ee:03
172.0.0.66 aa:bb:cc:dd:ee:0e
172.0.0.2 aa:bb:cc:dd:ee:05
172.0.0.4 aa:bb:cc:dd:ee:06

This might work for you (GNU sed):

sed 's|.*|/&/Id|' fileb | sed -f - filea
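
In case the trick is not obvious: the first sed turns every line of fileb into a delete command such as /AA:BB:CC:DD:EE:01/Id, and the second sed runs that generated script against filea, so any line containing one of the MACs (ignoring case) is dropped. A rough Python sketch of the same idea follows. The function name is hypothetical, and it assumes the MACs contain no regular-expression metacharacters and that fileb has no empty lines:

```python
def drop_matching_lines(file_a_lines, file_b_lines):
    """Keep only the lines of File A that contain none of the File B entries."""
    # Lower-case the patterns once; this stands in for sed's I (ignore case) flag.
    patterns = [p.strip().lower() for p in file_b_lines if p.strip()]
    kept = []
    for line in file_a_lines:
        lowered = line.lower()
        # Substring test anywhere in the line, like an unanchored /pattern/ address.
        if not any(p in lowered for p in patterns):
            kept.append(line.rstrip("\n"))
    return kept


if __name__ == "__main__":
    file_a = [
        "172.0.0.1 AA:BB:CC:DD:EE:01\n",
        "172.0.0.2 AA:BB:CC:DD:EE:02\n",
        "172.0.0.3 AA:BB:CC:DD:EE:03\n",
    ]
    file_b = ["AA:BB:CC:DD:EE:01\n", "AA:BB:CC:DD:EE:02\n"]
    for line in drop_matching_lines(file_a, file_b):
        print(line)
```

Unlike the field-based answers, this matches the MAC anywhere on the line, which is also how the fgrep version behaves.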

Source: https://stackoverflow.com/questions/11418438/how-to-check-if-a-list-of-strings-are-present-in-two-separate-files
