Create a zip file from a generator in Python?

佛祖请我去吃肉 2020-11-30 07:32

I've got a large amount of data (a couple gigs) I need to write to a zip file in Python. I can't load it all into memory at once to pass to the .writestr method of ZipFil

10 answers
  •  感情败类
    2020-11-30 08:12

    The only solution is to rewrite the method ZipFile uses for zipping files so that it reads from a buffer. It would be trivial to add this to the standard library; I'm kind of amazed it hasn't been done yet. I gather there's a lot of agreement that the entire interface needs to be overhauled, and that seems to be blocking any incremental improvements.

    import zipfile, zlib, binascii, struct
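    class BufferedZipFile(zipfile.ZipFile):
        def writebuffered(self, zipinfo, buffer):
            # Write one member by reading `buffer` (any object with .read())
            # in chunks, so the data never has to fit in memory at once.
            zinfo = zipinfo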
    
            zinfo.file_size = file_size = 0
            zinfo.flag_bits = 0x00
            zinfo.header_offset = self.fp.tell()
    
            self._writecheck(zinfo)
            self._didModify = True
    
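            # Write the local header with placeholder CRC and sizes; the real
            # values are only known once all the data has been read.
            zinfo.CRC = CRC = 0
            zinfo.compress_size = compress_size = 0
            self.fp.write(zinfo.FileHeader())
            if zinfo.compress_type == zipfile.ZIP_DEFLATED:
                # wbits=-15 produces a raw deflate stream with no zlib header.
                cmpr = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED, -15)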
            else:
                cmpr = None
    
            while True:
                buf = buffer.read(1024 * 8)
                if not buf:
                    break
    
                file_size = file_size + len(buf)
                CRC = binascii.crc32(buf, CRC) & 0xffffffff
                if cmpr:
                    buf = cmpr.compress(buf)
                    compress_size = compress_size + len(buf)
    
                self.fp.write(buf)
    
            if cmpr:
                buf = cmpr.flush()
                compress_size = compress_size + len(buf)
                self.fp.write(buf)
                zinfo.compress_size = compress_size
            else:
                zinfo.compress_size = file_size
    
            zinfo.CRC = CRC
            zinfo.file_size = file_size
    
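            # Seek back into the local file header and patch in the real values.
            # Offset 14 holds the CRC-32, compressed size, and uncompressed size
            # as three little-endian 32-bit fields.
            # NOTE: the original listing is cut off inside struct.pack(...); the
            # rest of this method is a reconstruction from the zip header layout.
            position = self.fp.tell()
            self.fp.seek(zinfo.header_offset + 14, 0)
            self.fp.write(struct.pack("<LLL", zinfo.CRC, zinfo.compress_size, zinfo.file_size))
            self.fp.seek(position, 0)
            # Register the entry so it ends up in the central directory.
            self.filelist.append(zinfo)
            self.NameToInfo[zinfo.filename] = zinfo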
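
For what it's worth, on Python 3.6 and later the subclass may not be needed: `ZipFile.open(name, mode="w")` returns a writable file-like object, so chunks from a generator can be written one at a time. The following is a minimal sketch of that alternative, not the method used in the answer above. The helper names `zip_from_chunks` and `generate_chunks` and the member name `data.bin` are made up for illustration.

```python
import io
import zipfile


def zip_from_chunks(target, arcname, chunks):
    """Stream an iterable of bytes chunks into one member of a new zip archive.

    `target` is a path or a binary file object; `chunks` can be a generator.
    """
    with zipfile.ZipFile(target, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        # force_zip64=True reserves room for a member larger than 4 GiB,
        # since the final size is unknown when the header is first written.
        with zf.open(arcname, mode="w", force_zip64=True) as member:
            for chunk in chunks:
                member.write(chunk)


def generate_chunks(count, size):
    """Hypothetical data source: yields `count` chunks of `size` bytes each."""
    for i in range(count):
        yield bytes([i % 256]) * size


if __name__ == "__main__":
    archive = io.BytesIO()  # stands in for a real file on disk
    zip_from_chunks(archive, "data.bin", generate_chunks(4, 1024))
    with zipfile.ZipFile(archive) as zf:
        print(zf.getinfo("data.bin").file_size)
```

Only one chunk is held in memory at a time, and the CRC and sizes are filled in by `zipfile` itself when the member is closed.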
