Compress a large file in Ruby with Zlib for gzip

Submitted by 寵の児 on 2019-12-06 05:31:23

Question


I have a very large file, approx. 200 million rows of data.

I would like to compress it with the Zlib library, specifically using the Writer.

Reading through each line one at a time seems like it would take quite a bit of time. Is there a better way to accomplish this?

Here is what I have right now:

require 'zlib'

Zlib::GzipWriter.open('compressed_file.gz') do |gz|
  # File.foreach reads one line at a time and closes the file when done.
  File.foreach(large_data_file) do |line|
    gz.write line
  end
  # The block form of GzipWriter.open closes the stream automatically,
  # so an explicit gz.close is not needed here.
end

Answer 1:


You can use IO#read to read a chunk of arbitrary length from the file.

require 'zlib'

Zlib::GzipWriter.open('compressed_file.gz') do |gz|
  File.open(large_data_file) do |fp|
    # fp.read returns up to 16 KB per call and nil at EOF, ending the loop.
    while chunk = fp.read(16 * 1024)
      gz.write chunk
    end
  end
  # gz is closed automatically when the block exits; no explicit
  # gz.close is required.
end

This reads the source file in 16 KB chunks, compressing each chunk and appending it to the output stream. Adjust the chunk size to suit your environment; see the IO.copy_stream sketch below for a variant that picks a buffer size for you.
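As a side note, Ruby's built-in IO.copy_stream can perform the same chunked copy for you: it only requires that the destination respond to write, which Zlib::GzipWriter does, and it manages its own internal buffer size. A minimal sketch, assuming large_data_file holds the source path as in the question:

require 'zlib'

Zlib::GzipWriter.open('compressed_file.gz') do |gz|
  File.open(large_data_file, 'rb') do |fp|
    # Streams fp into gz in buffered chunks; nothing is loaded
    # into memory all at once.
    IO.copy_stream(fp, gz)
  end
end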



Source: https://stackoverflow.com/questions/24496799/compress-large-file-in-ruby-with-zlib-for-gzip
