python: read lines from compressed text files

Submitted by 非 Y 不嫁゛ on 2019-11-26 07:38:28

Question


Is it easy to read a line from a gz-compressed text file using Python without extracting the file completely? I have a text.gz file which is around 200 MB. When I extract it, it becomes 7.4 GB. And this is not the only file I have to read: for the whole process, I have to read 10 files. Although this will be a sequential job, I think it would be smart to do it without extracting everything. I do not even know whether it is possible. How can it be done using Python? I need to read a text file line by line.


Answer 1:


Have you tried using gzip.GzipFile? Arguments are similar to open.
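As a minimal sketch of this suggestion: gzip.GzipFile only supports binary modes, so one way to get text lines is to wrap it in io.TextIOWrapper. The function name iter_lines_gzipfile and the UTF-8 default are assumptions made for illustration, not part of the original answer.

```python
import gzip
import io


def iter_lines_gzipfile(path, encoding="utf-8"):
    """Yield decoded lines from a gzip file without extracting it to disk.

    Hypothetical helper; the encoding default is an assumption.
    """
    # GzipFile decompresses on the fly but only yields bytes,
    # so TextIOWrapper is used to decode each line to str.
    with gzip.GzipFile(path, mode="rb") as raw:
        with io.TextIOWrapper(raw, encoding=encoding) as text:
            for line in text:
                yield line.rstrip("\n")
```

Because the function is a generator, only a small buffer of decompressed data is held in memory at any time, regardless of how large the file would be once extracted.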




Answer 2:


Using gzip.open:

import gzip

with gzip.open('input.gz', 'rt') as f:
    for line in f:
        print('got line', line)

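Note: gzip.open(filename, mode) is a convenience wrapper around gzip.GzipFile(filename, mode). In binary mode it returns a GzipFile; in text mode ('rt') it additionally wraps that object in an io.TextIOWrapper, so each line comes back as str. I prefer gzip.open, as it looks similar to the with open(...) as f: used for opening uncompressed files.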
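Since the question mentions reading about ten files sequentially, the same pattern can be extended to a list of paths. The following is only a sketch: the helper name iter_gz_lines, the example file names, and the UTF-8 encoding are assumptions, not something stated in the question.

```python
import gzip


def iter_gz_lines(paths, encoding="utf-8"):
    """Yield (path, line_number, line) for each line of each gzip file.

    Hypothetical helper; files are opened one at a time and streamed,
    so nothing is extracted to disk.
    """
    for path in paths:
        # 'rt' opens the file in text mode, so lines are str, not bytes.
        with gzip.open(path, "rt", encoding=encoding) as f:
            for line_number, line in enumerate(f, start=1):
                yield path, line_number, line.rstrip("\n")


# Example usage (file names are placeholders):
# for path, number, line in iter_gz_lines(["part1.gz", "part2.gz"]):
#     print(path, number, line)
```

Each file is closed as soon as its last line has been consumed, before the next one is opened.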




Answer 3:


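You could use the standard gzip module in Python. Just use:

gzip.open('myfile.gz')

to open the file like any other file and read its lines. Note that the default mode is 'rb', so the lines come back as bytes; pass 'rt' as the mode to get str instead.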

More information here: Python gzip module



Source: https://stackoverflow.com/questions/10566558/python-read-lines-from-compressed-text-files
