Unzipping part of a .gz file using python

半阙折子戏 2020-12-06 02:59

So here's the problem. I have a sample.gz file which is roughly 60 KB in size. I want to decompress the first 2000 bytes of this file, but I am running into a "CRC check failed" error.

4 answers
  •  一个人的身影
    2020-12-06 03:10

    I also ran into this problem when using a Python script to read compressed files that were generated by the gzip tool under Linux and whose original files had been lost.

    By reading the implementation of Python's gzip.py, I found that gzip.GzipFile offers methods similar to those of a regular file object and relies on Python's zlib module to compress and decompress the data. It also has a _read_eof() method that checks the CRC of each file.

    But in some situations, such as processing a stream or a .gz file without a correct CRC (my problem), _read_eof() raises an IOError("CRC check failed"). I therefore modified the gzip module to disable the CRC check, and the problem disappeared.

    def _read_eof(self):
        # No-op override: skips the CRC check normally done at the end of the stream.
        pass
    

    https://github.com/caesar0301/PcapEx/blob/master/live-scripts/gzip_mod.py

    I know it's a brute-force solution, but it saves the time of rewriting low-level methods yourself with the zlib module, such as reading data chunk by chunk from the compressed files and extracting the data line by line, most of which is already present in the gzip module.
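
For comparison, the chunk-by-chunk zlib approach mentioned above can be fairly short when only the beginning of the file is needed. The following is a minimal sketch, not the code from the linked gzip_mod.py: the function name read_gzip_prefix and its parameters are made up for illustration. It assumes the file holds a single gzip member and relies on zlib verifying the CRC only when it reaches the end of the stream, so stopping early never triggers the check.

```python
import zlib


def read_gzip_prefix(fileobj, n_bytes, chunk_size=1024):
    """Return up to n_bytes of decompressed data from a binary gzip stream.

    Hypothetical helper for illustration; assumes a single gzip member.
    """
    # MAX_WBITS | 16 makes zlib expect a gzip header and trailer.
    decomp = zlib.decompressobj(wbits=zlib.MAX_WBITS | 16)
    out = bytearray()
    while len(out) < n_bytes and not decomp.eof:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break  # compressed input ran out (for example, a truncated file)
        # The second argument (max_length) caps the output of this call,
        # so the result never exceeds n_bytes.
        out += decomp.decompress(chunk, n_bytes - len(out))
    return bytes(out)
```

It could be used as `with open("sample.gz", "rb") as f: head = read_gzip_prefix(f, 2000)`. If n_bytes is larger than the decompressed size and the stored CRC is wrong, zlib will still raise zlib.error once it reaches the trailer, so this only sidesteps the check for partial reads.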

    Jamin
