Python bz2 uncompressed file size

Posted by 前提是你 on 2019-11-28 05:34:41

Question


I am using Python 2.7. I have a .bz2 file, and I need to figure out the uncompressed file size of its component file without actually decompressing it. I have found ways to do this for gzip and tar files. Anyone know of a way for bz2 files?

Thanks very much


Answer 1:


I suspect this is impossible due to the nature of the bz2 format and the compression techniques it uses. Here is a fairly good description of both the format and the algorithms: http://en.wikipedia.org/wiki/Bzip2#File_format

You cannot know the original data size until you decompress it.
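To illustrate the point, here is a minimal sketch that compresses some in-memory data and looks at the start of the bz2 stream. The payload and variable names are made up for the example. The stream header is just the magic bytes "BZh" followed by a digit for the block size, with no length field, whereas a gzip member stores the uncompressed size (modulo 2**32) in its last four bytes, which is presumably what the gzip approach mentioned in the question relies on.

```python
import bz2
import gzip
import io
import struct

# Hypothetical payload, used only for this illustration.
payload = b"x" * 100000

# bzip2: the stream header is 4 bytes, the magic "BZh" plus a digit
# '1'..'9' giving the block size in units of 100 kB. No size field follows.
bz_data = bz2.compress(payload)  # default compression level is 9
bz_header = bz_data[:4]
print(bz_header)

# gzip, for contrast: the last 4 bytes of a member hold the
# uncompressed size modulo 2**32, as a little-endian unsigned int.
buf = io.BytesIO()
with gzip.GzipFile(fileobj=buf, mode="wb") as f:
    f.write(payload)
gz_data = buf.getvalue()
isize = struct.unpack("<I", gz_data[-4:])[0]
print(isize)
```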




Answer 2:


As the other answers have stated, this is not possible without decompressing the data. However, if the decompressed data is too large to hold in memory at once, you can decompress it in chunks and add up the sizes of the chunks:

>>> import bz2
>>> with bz2.BZ2File('data.bz2', 'r') as data:
...     size = 0
...     chunk = data.read(1024)
...     while chunk:
...         size += len(chunk)
...         chunk = data.read(1024)
... 
>>> size
11107
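The loop above could be wrapped in a small helper along these lines. This is only a sketch: the function name bz2_uncompressed_size and the 1 MiB default chunk size are my own choices, not anything provided by the bz2 module. A larger chunk than 1024 bytes mainly reduces the number of read() calls, while memory use stays bounded by the chunk size.

```python
import bz2


def bz2_uncompressed_size(path, chunk_size=1024 * 1024):
    """Return the decompressed size in bytes of the bz2 file at `path`.

    The file is still fully decompressed, but only `chunk_size` bytes
    of output are held in memory at any one time.
    """
    total = 0
    with bz2.BZ2File(path) as f:  # default mode is read
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # empty result means end of file
                break
            total += len(chunk)
    return total
```

Calling bz2_uncompressed_size('data.bz2') on the file from the session above should give the same number as the manual loop.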

Alternatively (and probably faster, though I haven't profiled this), you can seek() to the end of the file and then use tell() to find out how long it is. Note that seeking in a BZ2File is emulated, so this still decompresses the data internally:

>>> import bz2
>>> import os
>>> with bz2.BZ2File('data.bz2', 'r') as data:
...     data.seek(0, os.SEEK_END)
...     size = data.tell()
...
>>> size
11107L



Answer 3:


It seems that determining the uncompressed size of a bz2 file without actually decompressing it is impossible. See this link for more details and a possible solution: https://superuser.com/questions/53984/is-there-a-way-to-determine-the-decompressed-size-of-a-bz2-file



Source: https://stackoverflow.com/questions/12647738/python-bz2-uncompressed-file-size
