Read, edit, and write a text file line-wise using Ruby

长情又很酷 2020-12-02 11:06

Is there a good way to read, edit, and write files in place in Ruby?

In my online search I've found stuff suggesting to read it all into an array, modify said array

4 answers
  •  南方客 (original poster)
    2020-12-02 11:51

    In general, there's no way to make arbitrary edits in the middle of a file. It's not a deficiency of Ruby. It's a limitation of the file system: Most file systems make it easy and efficient to grow or shrink the file at the end, but not at the beginning or in the middle. So you won't be able to rewrite a line in place unless its size stays the same.
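
    As a minimal sketch of the one case that does work in place, the snippet below overwrites bytes without changing the file's length. The helper name overwrite_same_size is hypothetical (it is not part of Ruby), it only touches the first match on each line, and it refuses to run when the replacement differs in byte size:

    ```ruby
    # Hypothetical helper: overwrite the first occurrence of `old` on each line
    # with `new`, directly inside the file. Both strings must have the same
    # byte size, otherwise the following bytes would have to be shifted.
    def overwrite_same_size(path, old, new)
      unless old.bytesize == new.bytesize
        raise ArgumentError, 'replacement must have the same byte size'
      end

      # 'r+b' opens for reading and writing in binary mode without truncating.
      File.open(path, 'r+b') do |file|
        loop do
          line_start = file.pos
          line = file.gets
          break if line.nil?

          next_line = file.pos
          # Both strings are binary here, so the index is a byte offset.
          offset = line.index(old.b)
          next if offset.nil?

          file.seek(line_start + offset)
          file.write(new)
          file.seek(next_line) # resume reading at the following line
        end
      end
    end
    ```

    Calling overwrite_same_size('/tmp/foo', 'foo', 'FOO') leaves the file size unchanged; anything that grows or shrinks a line needs one of the two techniques described next.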

    There are two general models for modifying a bunch of lines. If the file is not too large, just read it all into memory, modify it, and write it back out. For example, adding "Kilroy was here" to the beginning of every line of a file:

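    path = '/tmp/foo'
    # Read every line into memory and prepend the marker to each one.
    lines = File.readlines(path).map do |line|
      'Kilroy was here ' + line
    end
    # Opening with 'w' truncates the file; each line still ends with its own newline.
    File.open(path, 'w') do |file|
      file.puts lines
    end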
    

    Although simple, this technique has a danger: If the program is interrupted while writing the file, you'll lose part or all of it. It also needs to use memory to hold the entire file. If either of these is a concern, then you may prefer the next technique.

    You can, as you note, write to a temporary file. When done, rename the temporary file so that it replaces the input file:

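    require 'tempfile'
    require 'fileutils'

    path = '/tmp/foo'
    temp_file = Tempfile.new('foo')
    begin
      # Stream the input one line at a time so the whole file never sits in memory.
      File.open(path, 'r') do |file|
        file.each_line do |line|
          temp_file.puts 'Kilroy was here ' + line
        end
      end
      # Close before renaming so that buffered output is written out.
      temp_file.close
      FileUtils.mv(temp_file.path, path)
    ensure
      # Closing twice is harmless, and unlink ignores a file that was already moved.
      temp_file.close
      temp_file.unlink
    end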
    

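    As long as the temporary file and the input file are on the same file system, the rename (FileUtils.mv) is atomic, so the rewritten input file will pop into existence all at once. If the program is interrupted, either the file will have been rewritten, or it will not. There's no possibility of it being partially rewritten. Across file systems, FileUtils.mv falls back to copying the file and then deleting the source, which is not atomic.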

    The ensure clause is not strictly necessary: The file will be deleted when the Tempfile instance is garbage collected. However, that could take a while. The ensure block makes sure that the tempfile gets cleaned up right away, without having to wait for it to be garbage collected.
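
    One way to keep the rename on a single file system is to create the temporary file in the target's own directory. The sketch below wraps that in a block-taking helper; the name rewrite_lines is hypothetical, and copying the permissions is an added assumption about what you'd want, since Tempfile creates its files with mode 0600:

    ```ruby
    require 'tempfile'

    # Hypothetical helper (not part of Ruby's standard library): yields each
    # line of `path` to the block and replaces the file with the block's results.
    def rewrite_lines(path)
      # Put the temp file next to the target so the rename never crosses
      # a file system boundary.
      temp_file = Tempfile.new(File.basename(path), File.dirname(path))
      begin
        File.foreach(path) { |line| temp_file.write(yield(line)) }
        temp_file.close
        # Carry the original file's permission bits over to the replacement.
        File.chmod(File.stat(path).mode & 0o7777, temp_file.path)
        File.rename(temp_file.path, path)
      ensure
        temp_file.close
        temp_file.unlink
      end
    end
    ```

    With that in place, the Kilroy example becomes rewrite_lines('/tmp/foo') { |line| 'Kilroy was here ' + line }. If the block raises, the ensure clause removes the temporary file and the original is left untouched.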
