Ruby streaming tar/gz

Submitted by 老子叫甜甜 on 2019-12-21 14:41:48

Question


Basically I want to stream data from memory into a tar/gz archive (possibly multiple files in the tar, but it should NEVER TOUCH THE HARD DRIVE, only streaming!), then stream the result somewhere else (an HTTP request body in my case).

Anyone know of an existing library that can do this? Is there something in Rails?

libarchive-ruby is only a C wrapper and seems like it would be very platform-dependent (the docs want you to compile as an installation step?!).

SOLUTION:

require 'zlib'
require 'stringio'
require 'rubygems/package'

# Build the tar archive in memory.
tar = StringIO.new

Gem::Package::TarWriter.new(tar) { |writer|
  writer.add_file("a_file.txt", 0644) { |f|
    1000.times { f.write("some text\n") }
  }
  writer.add_file("another_file.txt", 0644) { |f|
    f.write("some more text\n")
  }
}
tar.seek(0)

# Gzip the tar bytes. Make sure you use 'wb' for binary write!
gz = Zlib::GzipWriter.new(File.new('this_is_a_tar_gz.tar.gz', 'wb'))
gz.write(tar.read)
tar.close
gz.close

That's it! You can swap out the File in the GzipWriter with any IO to keep it streaming. Cookies for dw11wtq!
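As a rough sketch of that swap (not part of the original solution): the TarWriter can write directly into a GzipWriter, which in turn writes into any IO-like object, so no intermediate tar buffer is needed. This assumes the size of each file is known up front, because it uses `add_file_simple` instead of `add_file` (`add_file` needs a seekable IO, which a GzipWriter is not). The helper name `stream_tgz` and the StringIO sink are placeholders for illustration; the sink could be a socket or pipe instead.

```ruby
require 'zlib'
require 'stringio'
require 'rubygems/package'

# Hypothetical helper: writes a gzipped tarball of `files`
# (a Hash of name => content) straight into the IO-like object `out`.
def stream_tgz(files, out)
  gz = Zlib::GzipWriter.new(out)
  Gem::Package::TarWriter.new(gz) do |tar|
    files.each do |name, body|
      # add_file_simple takes the size up front, so it never seeks on the IO.
      tar.add_file_simple(name, 0644, body.bytesize) { |f| f.write(body) }
    end
  end
  gz.finish # writes the gzip trailer but leaves `out` open
  out
end

# A StringIO stands in for the real destination here.
sink = StringIO.new(String.new)
stream_tgz({ "a_file.txt" => "some text\n" * 3,
             "another_file.txt" => "some more text\n" }, sink)
```

The trade-off is that each file's byte size must be passed before its content is written; if the size is not known in advance, buffering that one file in a StringIO first is probably the simplest workaround.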


Answer 1:


Take a look at the TarWriter class in rubygems: http://rubygems.rubyforge.org/rubygems-update/Gem/Package/TarWriter.html. It just operates on an IO stream, which may be a StringIO.

require 'stringio'
require 'rubygems/package'

tar = StringIO.new

Gem::Package::TarWriter.new(tar) do |writer|
  writer.add_file("hello_world.txt", 0644) { |f| f.write("Hello world!\n") }
end

tar.seek(0)

p tar.read #=> mostly padding, but a tar nonetheless

It also provides methods to add directories if you need a directory layout in the tarball.
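A minimal sketch of the directory case, assuming `TarWriter#mkdir(name, mode)` for the directory entry and `Gem::Package::TarReader` to list the result; the names `docs` and `readme.txt` are made up for illustration.

```ruby
require 'stringio'
require 'rubygems/package'

tar_io = StringIO.new(String.new)

Gem::Package::TarWriter.new(tar_io) do |writer|
  # Directory entry first, then a file placed inside it.
  writer.mkdir("docs", 0755)
  writer.add_file("docs/readme.txt", 0644) { |f| f.write("Hello from a subdirectory\n") }
end

# Read the archive back to list what it contains.
tar_io.rewind
entries = []
Gem::Package::TarReader.new(tar_io) do |reader|
  reader.each do |entry|
    kind = entry.directory? ? "dir" : "file"
    entries << [entry.full_name, kind]
  end
end
```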

For reference, you could achieve the gzipping with IO.popen, just piping the data in/out of the system process:

http://www.ruby-doc.org/core-1.9.2/IO.html#method-c-popen

The gzipping itself would look something like this:

gzipped_data = IO.popen("gzip", "w+") do |gzip|
  gzip.puts "Hello world!"
  gzip.close_write
  gzip.read
end
# => "\u001F\x8B\b\u0000\xFD\u001D\xA2N\u0000\u0003\xF3H\xCD\xC9\xC9W(\xCF/\xCAIQ\xE4\u0002\u0000A\xE4\xA9\xB2\r\u0000\u0000\u0000"
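For comparison, the same compression can be done in-process with the standard zlib library, which avoids depending on a `gzip` binary being installed. This is only a sketch and assumes Ruby 2.4 or newer, where `Zlib.gzip` and `Zlib.gunzip` are available.

```ruby
require 'zlib'

# Same payload as above (puts appends the trailing newline).
gzipped_data = Zlib.gzip("Hello world!\n")

# Decompress again to confirm the round trip.
round_trip = Zlib.gunzip(gzipped_data)
```

The exact compressed bytes can differ from the popen output, since the gzip header carries a timestamp and other metadata, but both decompress to the same content.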



Answer 2:


Based on the solution the OP wrote, I wrote a fully in-memory tgz archive function that I use to POST to a web server.

  # Create a tar.gz archive from files, entirely in memory.
  # Parameters:
  #   files: Array of hashes with the string keys "filename" and "body"
  #     Ex: [{"filename" => "foo.txt", "body" => "This is foo.txt"},...]
  #
  # Return:: the tar.gz archive as a binary string
  def create_tgz_archive_from_files(files)
    tar = StringIO.new
    Gem::Package::TarWriter.new(tar){ |tar_writer|
      files.each{|file|
        tar_writer.add_file(file['filename'], 0644){|f|
          f.write(file['body'])
        }
      }
    }
    tar.rewind

    gz = StringIO.new('', 'r+b')
    gz.set_encoding("BINARY")
    gz_writer = Zlib::GzipWriter.new(gz)
    gz_writer.write(tar.read)
    tar.close
    gz_writer.finish
    gz.rewind
    tar_gz_buf = gz.read
    return tar_gz_buf
  end
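A sketch of how such a buffer might be attached to a POST request with Net::HTTP. The endpoint URL is a placeholder and the `application/gzip` content type is an assumption about what the receiving server expects. The archive is rebuilt inline so the snippet runs on its own, and the actual send is left commented out so no network access is needed.

```ruby
require 'zlib'
require 'stringio'
require 'net/http'
require 'uri'
require 'rubygems/package'

# Build a tiny tar.gz in memory (condensed version of the approach above).
tar = StringIO.new(String.new)
Gem::Package::TarWriter.new(tar) do |writer|
  writer.add_file("foo.txt", 0644) { |f| f.write("This is foo.txt") }
end
tgz = Zlib.gzip(tar.string)

# Hypothetical endpoint; replace with the real upload URL.
uri = URI("http://example.com/upload")
request = Net::HTTP::Post.new(uri)
request['Content-Type'] = 'application/gzip'
request.body = tgz

# Sending is commented out so the sketch runs offline:
# response = Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
```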


Source: https://stackoverflow.com/questions/7856491/ruby-streaming-tar-gz
