Read, edit, and write a text file line-wise using Ruby

长情又很酷 2020-12-02 11:06

Is there a good way to read, edit, and write files in place in Ruby?

In my online search I've found stuff suggesting to read it all into an array, modify said array

4 answers
  •  孤街浪徒
    2020-12-02 11:51

    You can write into the middle of a file, but you have to be careful to keep the replacement the same length as the string you overwrite; otherwise you overwrite some of the text that follows. Here is an example using IO#seek. IO::SEEK_CUR makes the offset relative to the current position of the file pointer, which is at the end of the line that was just read; the +1 is for the CR character at the end of the line.

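    look_for     = "bbb"
    replace_with = "xxxxx"   # two characters longer than look_for
    
    File.open(DATA, 'r+') do |file|
      file.each_line do |line|
        if line[look_for]
          # Move back to the start of the line just read. The +1 covers the CR
          # of a Windows CR LF line ending; for LF-only files it must be dropped.
          file.seek(-(line.length + 1), IO::SEEK_CUR)
          file.write line.gsub(look_for, replace_with)
        end
      end
    end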
    __END__
    aaabbb
    bbbcccffffd
    ffffdeee
    eee
    

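    After the script runs, the data at the end of the script looks like this, which is probably not what you had in mind: the replacement is two characters longer than the match, so the rewritten first line runs over the first two characters of the next line.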

    aaaxxxxx
    bcccffffd
    ffffdeee
    eee
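
The overwrite problem shown above can be avoided by refusing replacements of a different length. The following is a minimal sketch, not the answer's own code: the method name `replace_same_length` is hypothetical, and it assumes the file is opened in binary mode so that byte counts match what is on disk regardless of platform line endings.

```ruby
# Hypothetical helper (not part of the answer above): replace `look_for`
# with `replace_with` in place, but only when both have the same byte
# length, so no following text is overwritten.
# Returns the number of lines that were rewritten.
def replace_same_length(path, look_for, replace_with)
  # Work on binary copies so comparisons with binary-mode lines are safe.
  needle     = look_for.b
  substitute = replace_with.b
  unless needle.bytesize == substitute.bytesize
    raise ArgumentError, 'replacement must have the same byte length as the match'
  end

  changed = 0
  # 'r+b' opens for reading and writing without newline translation,
  # so line.bytesize equals the number of bytes the line occupies on disk.
  File.open(path, 'r+b') do |file|
    file.each_line do |line|
      next unless line.include?(needle)

      file.seek(-line.bytesize, IO::SEEK_CUR) # back to the start of this line
      file.write(line.gsub(needle, substitute))
      changed += 1
    end
  end
  changed
end
```

Called as `replace_same_length('some_file.txt', 'bbb', 'xxx')`, it rewrites matching lines over themselves; with `'xxxxx'` as the replacement it raises instead of corrupting the next line.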
    

    With that caveat in mind, this technique is much faster than the classic 'read and write to a new file' method. See these benchmarks on a 1.7 GB file of music data. For the classic approach I used Wayne's technique. The benchmark uses the .bmbm method, which runs a rehearsal pass first, so that file caching does not skew the results much. Tests were done with MRI Ruby 2.3.0 on Windows 7. The strings were actually replaced; I checked both methods.

    require 'benchmark'
    require 'tempfile'
    require 'fileutils'
    
    look_for      = "Melissa Etheridge"
    replace_with  = "Malissa Etheridge"
    very_big_file = 'D:\Documents\muziekinfo\all.txt'.gsub('\\','/')
    
    # In-place variant: rewrite each matching line over itself.
    # Only safe when the replacement is exactly as long as the match.
    def replace_in_place file_path, look_for, replace_with
      File.open(file_path, 'r+') do |file|
        file.each_line do |line|
          if line[look_for]
            # Back to the start of the line just read (+1 for the CR on Windows).
            file.seek(-(line.length + 1), IO::SEEK_CUR)
            file.write line.gsub(look_for, replace_with)
          end
        end
      end
    end
    
    # Classic variant: copy every line to a temp file, then move it over the original.
    def replace_classic path, look_for, replace_with
      temp_file = Tempfile.new('foo')
      File.foreach(path) do |line|
        if line[look_for]
          temp_file.write line.gsub(look_for, replace_with)
        else
          temp_file.write line
        end
      end
      temp_file.close
      FileUtils.mv(temp_file.path, path)
    ensure
      temp_file.close
      temp_file.unlink
    end
    
    # "adapt" applies the replacement, "restore" reverses it again.
    Benchmark.bmbm do |x|
      x.report("adapt          ") { replace_in_place very_big_file, look_for, replace_with }
      x.report("restore        ") { replace_in_place very_big_file, replace_with, look_for }
      x.report("classic adapt  ") { replace_classic very_big_file, look_for, replace_with }
      x.report("classic restore") { replace_classic very_big_file, replace_with, look_for }
    end
    

    Which gave

    Rehearsal ---------------------------------------------------
    adapt             6.989000   0.811000   7.800000 (  7.800598)
    restore           7.192000   0.562000   7.754000 (  7.774481)
    classic adapt    14.320000   9.438000  23.758000 ( 32.507433)
    classic restore  14.259000   9.469000  23.728000 ( 34.128093)
    ----------------------------------------- total: 63.040000sec
    
                          user     system      total        real
    adapt             7.114000   0.718000   7.832000 (  8.639864)
    restore           6.942000   0.858000   7.800000 (  8.117839)
    classic adapt    14.430000   9.485000  23.915000 ( 32.195298)
    classic restore  14.695000   9.360000  24.055000 ( 33.709054)
    

    So the in-place replacement was roughly 4 times faster in real (wall-clock) time: about 8 seconds versus 32 to 34 seconds.
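
The answer says the replacements were checked for both methods but does not show how. One way such a check could look is sketched below; this is an assumption, not the author's procedure, and the helper name `count_matching_lines` is hypothetical. It streams the file, so a multi-gigabyte file never has to fit in memory.

```ruby
# Hypothetical helper: count the lines of a file that contain `needle`,
# reading one line at a time.
def count_matching_lines(path, needle)
  File.foreach(path).count { |line| line.include?(needle) }
end
```

After an "adapt" run, `count_matching_lines(very_big_file, look_for)` should be 0, and the count for `replace_with` should equal the count `look_for` had before the run, provided the file did not already contain the replacement string.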
