Read a file in chunks in Ruby

一个人的身影 2020-12-15 08:09

I need to read a file in megabyte-sized chunks. Is there a cleaner way to do this in Ruby than the following?

FILENAME = "d:\\tmp\\file.bin"
MEGABYTE = 1024*1024
size = File.size(FILEN         


        
5 Answers
  • 2020-12-15 08:42

    Adapted from the Ruby Cookbook page 204:

    FILENAME = "d:\\tmp\\file.bin"
    MEGABYTE = 1024 * 1024
    
    class File
      def each_chunk(chunk_size = MEGABYTE)
        yield read(chunk_size) until eof?
      end
    end
    
    open(FILENAME, "rb") do |f|
      f.each_chunk { |chunk| puts chunk }
    end
    

    Disclaimer: I'm a Ruby newbie and haven't tested this.

  • 2020-12-15 08:42

    The Ruby docs for IO (http://ruby-doc.org/core-2.2.2/IO.html) include this example, which reads a file line by line rather than in fixed-size chunks:

    IO.foreach("testfile") {|x| print "GOT ", x }
    

    One caveat: this process can read the temp file faster than the stream that generates it is written, so IMO a short delay should be added:

    IO.foreach("/tmp/streamfile") {|line|
      ParseLine.parse(line)
      sleep 0.3 # pause; this process will discontinue if it doesn't allow some buffering
    }
    
  • 2020-12-15 08:53

    Alternatively, if you don't want to monkeypatch File:

    until my_file.eof?
      do_something_with( my_file.read( bytes ) )
    end
    

    For example, streaming an uploaded tempfile into a new file:

    # tempfile is a File instance
    File.open( new_file, 'wb' ) do |f|
      # Read in small 64 KiB (65,536-byte) chunks to limit memory usage
      f.write(tempfile.read(2**16)) until tempfile.eof?
    end
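
For the copy-to-a-new-file case specifically, Ruby's built-in `IO.copy_stream` might save you the manual loop, since it copies between two IO objects without loading the whole file into memory. A minimal sketch follows. The `save_upload` name and the sample data are made up for illustration and not taken from the answer above.

```ruby
require "tempfile"
require "tmpdir"

# Hypothetical helper: copies everything from the current position of
# +source+ (an open IO, e.g. an uploaded tempfile) into a new file.
# Returns the number of bytes copied.
def save_upload(source, dest_path)
  File.open(dest_path, "wb") do |out|
    # IO.copy_stream reads from source until EOF and writes to out.
    IO.copy_stream(source, out)
  end
end

# Usage with stand-in data (assumed for the example only).
Dir.mktmpdir do |dir|
  Tempfile.create("upload") do |tempfile|
    tempfile.binmode
    tempfile.write("\x00\x01" * 100_000)
    tempfile.rewind # copy_stream starts at the current offset
    copied = save_upload(tempfile, File.join(dir, "copy.bin"))
    puts "copied #{copied} bytes"
  end
end
```

If you need to inspect or transform each chunk along the way, the explicit `read` loop is still the better fit.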
    
  • 2020-12-15 08:55
    FILENAME="d:/tmp/file.bin"
    
    class File
      MEGABYTE = 1024*1024
    
      def each_chunk(chunk_size=MEGABYTE)
        yield self.read(chunk_size) until self.eof?
      end
    end
    
    open(FILENAME, "rb") do |f|
      f.each_chunk {|chunk| puts chunk }
    end
    

    It works, mbarkhau. I just moved the constant definition to the File class and added a couple of "self"s for clarity's sake.
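
If reopening `File` feels too invasive, the same loop can probably live in a small module instead. This is only a sketch: `ChunkedReader` and its `each_chunk` are hypothetical names, and it relies on `IO#read(n)` returning `nil` at end of file rather than on `eof?`.

```ruby
require "tempfile"

# Hypothetical module, not part of Ruby or of the answers above.
module ChunkedReader
  MEGABYTE = 1024 * 1024

  # Yields the file at +path+ in binary chunks of at most +chunk_size+ bytes.
  # Without a block it returns an Enumerator over the same chunks.
  def self.each_chunk(path, chunk_size = MEGABYTE)
    return enum_for(:each_chunk, path, chunk_size) unless block_given?

    File.open(path, "rb") do |f|
      # IO#read with a positive length returns nil at EOF, ending the loop.
      while (chunk = f.read(chunk_size))
        yield chunk
      end
    end
  end
end

# Usage on a throwaway 2.5 MB file (sample data assumed for the example).
Tempfile.create("chunks") do |tmp|
  tmp.binmode
  tmp.write("x" * (2 * ChunkedReader::MEGABYTE + ChunkedReader::MEGABYTE / 2))
  tmp.flush
  ChunkedReader.each_chunk(tmp.path) { |chunk| puts chunk.bytesize }
end
```

Returning an Enumerator when no block is given lets callers chain `map`, `each_with_index` and friends over the chunks.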

  • 2020-12-15 08:56

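    You can use IO#each(sep, limit) with sep set to nil, so that only the limit splits the input (an empty string would switch to paragraph mode instead). For example:

    chunk_size = 1024
    # The block form closes the file when iteration finishes
    File.open('/path/to/file.txt') do |f|
      f.each(nil, chunk_size) do |chunk|
        puts chunk
      end
    end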
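
To see what sizes this approach yields, here is a small sketch, assuming a binary file and a made-up helper name `chunk_sizes`. It opens the file with "rb" so the limit is counted in plain bytes. In text mode, as far as I know, a chunk can come out slightly longer than the limit when the cut would fall inside a multibyte character.

```ruby
require "tempfile"

# Hypothetical helper: returns the byte size of every chunk that
# IO#each(nil, limit) yields for the file at +path+.
def chunk_sizes(path, chunk_size)
  File.open(path, "rb") do |f|
    # With sep = nil there is no line separator, so only the limit splits the data.
    f.each(nil, chunk_size).map(&:bytesize)
  end
end

# Usage on a throwaway 2,500-byte file (sample data assumed for the example).
Tempfile.create("each_limit") do |tmp|
  tmp.binmode
  tmp.write("z" * 2500)
  tmp.flush
  p chunk_sizes(tmp.path, 1024)
end
```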
    