So here\'s the problem. I have sample.gz file which is roughly 60KB in size. I want to decompress the first 2000 bytes of this file. I am running into CRC check failed error
I also encounter this problem when I use my python script to read compressed files generated by gzip tool under Linux and the original files were lost.
By reading the implementation of gzip.py of Python, I found that gzip.GzipFile had similar methods of File class and exploited python zip module to process data de/compressing. At the same time, the _read_eof() method is also present to check the CRC of each file.
But in some situations, like processing Stream or .gz file without correct CRC (my problem), an IOError("CRC check failed") will be raised by _read_eof(). Therefore, I try to modify the gzip module to disable the CRC check and finally this problem disappeared.
def _read_eof(self):
pass
https://github.com/caesar0301/PcapEx/blob/master/live-scripts/gzip_mod.py
I know it's a brute-force solution, but it save much time to rewrite yourself some low level methods using the zip module, like of reading data chuck by chuck from the zipped files and extract the data line by line, most of which has been present in the gzip module.
Jamin