Deleting lines from one file which are in another file

余生分开走 2020-11-28 01:46

I have a file f1:

line1
line2
line3
line4
..
..

I want to delete all the lines which are in another file f2:

9 answers
  •  轻奢々 (original poster)
    2020-11-28 02:13

    If you have Ruby (1.9+):

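    #!/usr/bin/env ruby
    # Print every line of file1 that does not appear in file2.
    # Split on newlines (not on any whitespace) so lines containing spaces stay intact.
    b = File.read("file2").split("\n")
    File.foreach("file1") do |x|
      x.chomp!
      puts x unless b.include?(x)  # Array#include? scans b linearly
    end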
    

    This has O(N^2) complexity, because every include? call scans the whole array. If you care about performance, here's another version:

    b = File.read("file2").split("\n")
    a = File.read("file1").split("\n")
    (a - b).each { |x| puts x }  # Array#- keeps the elements of a that are not in b
    

    This version uses a hash to perform the subtraction (Array#-), so its complexity is O(n) (size of a) + O(n) (size of b).
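
    To make the hash-based idea explicit, here is a minimal sketch that does the same subtraction with a Set from Ruby's standard library. The helper name lines_not_in and the command-line handling are assumptions for illustration, not part of the original answer.

    ```ruby
    require "set"

    # Hypothetical helper (name chosen for this sketch): returns the entries of
    # `lines` that do not occur in `excluded`, keeping their order and duplicates.
    def lines_not_in(lines, excluded)
      lookup = Set.new(excluded)                     # one pass over excluded
      lines.reject { |line| lookup.include?(line) }  # average O(1) per lookup
    end

    # Assumed usage: ruby this_script.rb file1 file2
    if __FILE__ == $PROGRAM_NAME && ARGV.size == 2
      main_path, exclude_path = ARGV
      main_lines = File.read(main_path).split("\n")
      excluded_lines = File.read(exclude_path).split("\n")
      lines_not_in(main_lines, excluded_lines).each { |line| puts line }
    end
    ```

    Building the Set costs one pass over the second file, and each lookup afterwards is constant time on average, which is where the linear overall cost comes from.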

    Here's a little benchmark of the above, courtesy of user576875, but with 100K lines:

    $ for i in $(seq 1 100000); do echo "$i"; done|sort --random-sort > file1
    $ for i in $(seq 1 2 100000); do echo "$i"; done|sort --random-sort > file2
    $ time ruby test.rb > ruby.test
    
    real    0m0.639s
    user    0m0.554s
    sys     0m0.021s
    
    $ time sort file1 file2 | uniq -u > sort.test
    
    real    0m2.311s
    user    0m1.959s
    sys     0m0.040s
    
    $ diff <(sort -n ruby.test) <(sort -n sort.test)
    $
    

    diff prints nothing here, which shows that there are no differences between the two generated files.
