Is there a way to seek through a file without loading the whole thing into an array?

执念已碎 2021-01-05 15:54

This works:

f = File.new("myfile").readlines
f[0] #=> "line 1"
f[21] #=> "line 22"

But what if I have a very large file, and on

3 Answers
  • 2021-01-05 16:02

    Don't overlook the IO class. IO::foreach returns an Enumerator when called without a block, so it can be evaluated lazily.

    IO#each_line is another method that returns an Enumerator when no block is given.

    In Ruby 2.0 and later we can call .lazy on these enumerators and use the lazy methods, except for zip and cycle, to traverse the file without bringing the whole thing into memory.
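
As a minimal sketch of the lazy approach described in this answer: the helper name line_at and the generated sample file are assumptions made for illustration, not part of the original answer. It relies only on File.foreach, Enumerator#lazy, drop and first.

```ruby
require "tempfile"

# Build a small sample file so the sketch runs on its own;
# in practice the path would point at the large file.
sample = Tempfile.new("myfile")
(1..100).each { |n| sample.puts "line #{n}" }
sample.close

# Hypothetical helper: returns the line at a 0-based index, or nil if the
# file is shorter. File.foreach without a block returns an Enumerator, and
# .lazy defers the work so reading stops once `first` has its element.
def line_at(path, index)
  File.foreach(path).lazy.drop(index).first&.chomp
end

first_line = line_at(sample.path, 0)   # "line 1"
later_line = line_at(sample.path, 21)  # "line 22"
missing    = line_at(sample.path, 500) # nil, the file has only 100 lines
```

Each call reopens the file and reads from the start, so this suits occasional lookups rather than many lookups in a row.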

  • 2021-01-05 16:18

    For this purpose you can use the each_line iterator combined with with_index, which gives you the number of the current line (counting from 0):

    File.open('myfile') do |file|
    
      file.each_line.with_index do |line, lineno|
        case lineno
        when 0
          # line 1
        when 21
          # line 22
        end   
      end
    
    end
    

    By using open with a block instead of new, you are guaranteed that the file is closed properly when the block finishes.


    Update: the with_index method accepts an optional argument specifying the starting index, so the code above could be better written like this:

    file.each_line.with_index(1) do |line, lineno|
      case lineno
      when 1
        # line 1
      end
    end
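
Building on the each_line.with_index(1) pattern in this answer, the sketch below collects several lines in a single pass and stops reading once the last one is found. The helper name lines_at and the generated sample file are assumptions for illustration only.

```ruby
require "tempfile"

# Hypothetical helper: returns a Hash mapping each requested 1-based line
# number to its text, reading the file once and stopping early.
def lines_at(path, wanted)
  remaining = wanted.uniq
  found = {}
  return found if remaining.empty?

  File.open(path) do |file|
    file.each_line.with_index(1) do |line, lineno|
      # Array#delete returns nil when lineno was not requested.
      next unless remaining.delete(lineno)
      found[lineno] = line.chomp
      break if remaining.empty? # nothing left to look for
    end
  end
  found
end

# Sample data so the sketch is runnable on its own.
sample = Tempfile.new("myfile")
(1..100).each { |n| sample.puts "line #{n}" }
sample.close

result = lines_at(sample.path, [22, 1])
# result[1] is "line 1" and result[22] is "line 22"
```

Line numbers beyond the end of the file are simply absent from the returned Hash.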
    
  • 2021-01-05 16:20

    I used Jack's and toro2k's answers (roughly the same answer) but modified them for my own use case, where I may want to open a file and seek to multiple arbitrary lines, not always in sequential order. This is what I came up with (abstracted):

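    class LazyFile
        def initialize(file)
            @content = File.new(file)
        end
    
        # Returns the line at the given 0-based index, newline included.
        # Raises EOFError if the file has fewer lines.
        def [](lineno)
            # IO#lineno counts the lines read so far; rewind resets it to 0.
            @content.rewind if @content.lineno > lineno
            skip = lineno - @content.lineno
            skip.times { @content.readline }
            @content.readline
        end
    
        # Release the file handle when done.
        def close
            @content.close
        end
    end
    
    file = LazyFile.new("myfile")
    file[1001] # the 1002nd line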
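
One possible variation on this idea, sketched here as an assumption and not part of the original answer: remember the byte offset where each line starts, so that going back to an earlier line uses IO#seek instead of rewinding and re-reading. The class name OffsetIndexedFile is hypothetical. Memory use is one integer per line visited, not the file contents.

```ruby
require "tempfile"

# Hypothetical class: indexes line start offsets on demand.
class OffsetIndexedFile
  def initialize(path)
    @file = File.new(path)
    @offsets = [0] # @offsets[n] is the byte offset where line n (0-based) starts
  end

  # Returns the line at the given 0-based index without its newline,
  # or nil if the file has fewer lines.
  def [](lineno)
    # Extend the index only as far as needed, continuing from the
    # last known line start.
    while @offsets.size <= lineno
      @file.seek(@offsets.last)
      return nil if @file.gets.nil?
      @offsets << @file.pos
    end
    @file.seek(@offsets[lineno])
    @file.gets&.chomp
  end

  def close
    @file.close
  end
end

# Sample data so the sketch is runnable on its own.
sample = Tempfile.new("myfile")
(1..100).each { |n| sample.puts "line #{n}" }
sample.close

indexed = OffsetIndexedFile.new(sample.path)
indexed[21] # "line 22"
indexed[0]  # "line 1", reached by seeking back to a stored offset
indexed.close
```

The first visit to a line still has to read everything before it; only repeat or backward lookups benefit from the stored offsets.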
    