Print lines in one file matching patterns in another file

-上瘾入骨i 2020-11-29 06:45

I have a file with more than 40,000 lines (file1) and I want to extract the lines matching patterns in file2 (about 6,000 lines). I use grep like this, but it is very slow:

5 answers
  •  生来不讨喜
    2020-11-29 07:12

    Just for fun, here's a Perl version:

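    #!/usr/bin/perl
    use strict;
    use warnings;
    my %patterns;
    my $srch;
    
    # Open file2 and store each pattern as a hash key
    open(my $fh2, "<", "file2") || die "ERROR: Could not open file2: $!";
    while (<$fh2>)
    {
       chomp;   # strip only the trailing newline (chop would remove any last character)
       $patterns{$_} = 1;
    }
    close($fh2);
    
    # Now read the data file and print lines whose second field is a known pattern
    open(my $fh1, "<", "file1") || die "ERROR: Could not open file1: $!";
    while (<$fh1>)
    {
       (undef, $srch, undef) = split;
       # Skip lines with no second field (e.g. blank lines) to avoid warnings
       print $_ if defined $srch && exists $patterns{$srch};
    }
    close($fh1);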
    

    Here are some timings, using a 60,000-line file1 and a 6,000-line file2 per Ed's file creation method:

    time awk 'NR==FNR{pats[$0]; next} $2 in pats' file2 file1 > out
    real    0m0.202s
    user    0m0.197s
    sys     0m0.005s
    
    time ./go.pl > out2
    real    0m0.083s
    user    0m0.079s
    sys     0m0.004s
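
    To see what the awk one-liner timed above actually selects, here is a minimal sketch on made-up sample data. The file contents and the temporary directory are assumptions for illustration only, not the files used for the timings. It shows that each line of file2 is treated as an exact string compared against the second whitespace-separated field of file1, not as a regular expression or substring, which is the same lookup the Perl version performs.

    ```shell
    #!/bin/sh
    # Tiny stand-ins for file1 and file2; the contents are invented for this demo.
    tmp=$(mktemp -d)
    printf '%s\n' 'a foo 1' 'b bar 2' 'c baz 3' 'd foobar 4' > "$tmp/file1"
    printf '%s\n' 'foo' 'baz' > "$tmp/file2"

    # First pass (NR==FNR): store every line of file2 as an array key.
    # Second pass: print lines of file1 whose second field is one of those keys.
    result=$(awk 'NR==FNR{pats[$0]; next} $2 in pats' "$tmp/file2" "$tmp/file1")
    printf '%s\n' "$result"

    rm -rf "$tmp"
    ```

    The line `d foobar 4` is not printed, because `foobar` is not itself a key even though it contains `foo`. If substring or regex matching is what you need, this hash-lookup approach does not apply as-is.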
    
