Python gzip refuses to read uncompressed file

一整个雨季 2021-01-17 13:31

I seem to remember that the Python gzip module previously allowed you to read non-gzipped files transparently. This was really useful, as it allowed you to read an input file wh

4 answers

一个人的身影 (OP)

2021-01-17 14:25
The best solution for this would be to use something like https://github.com/ahupp/python-magic with libmagic. You cannot avoid reading at least a header to identify a file (unless you implicitly trust file extensions).

If you're feeling spartan, the magic number that identifies gzip(1) files is the two bytes 0x1f 0x8b at the very start of the file.
```
In [1]: f = open('foo.html.gz', 'rb')
In [2]: print(repr(f.read(2)))
b'\x1f\x8b'
```
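The same check can be wrapped in a small helper. This is only a minimal sketch: the name `looks_like_gzip` and the constant `GZIP_MAGIC` are made up for illustration and are not part of the gzip module.

```python
GZIP_MAGIC = b"\x1f\x8b"  # ID1 and ID2 bytes from the gzip header


def looks_like_gzip(path):
    """Return True if the file at `path` starts with the gzip magic number.

    Hypothetical helper: it only inspects the first two bytes, so it can
    be fooled by a non-gzip file that happens to start with them.
    """
    with open(path, "rb") as f:  # binary mode, so read() returns bytes
        return f.read(2) == GZIP_MAGIC
```

An empty or one-byte file simply yields a shorter read, so the comparison is False rather than an error.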
gzip.open is just a thin wrapper around GzipFile, so you could write a function like this one, which returns the correct type of object depending on what the source is, without having to open the file twice:
```
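#!/usr/bin/python

import gzip

# The file is opened in binary mode, so compare against a bytes literal.
GZIP_MAGIC = b'\x1f\x8b'

def opener(filename):
    """Return a readable binary file object, decompressing gzip input."""
    f = open(filename, 'rb')
    is_gzip = (f.read(2) == GZIP_MAGIC)
    f.seek(0)  # rewind so GzipFile (or the caller) sees the whole file
    if is_gzip:
        return gzip.GzipFile(fileobj=f)
    return f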
```
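If you want text rather than bytes, a variant along the following lines might work. It is a sketch under a few assumptions: the function name `open_text_maybe_gzip` is hypothetical, UTF-8 is assumed as the default encoding, and it uses `peek()` on the buffered reader instead of `read()` plus `seek(0)`.

```python
import gzip
import io

GZIP_MAGIC = b"\x1f\x8b"


def open_text_maybe_gzip(path, encoding="utf-8"):
    """Open `path` for reading text, decompressing it if it is gzipped.

    Hypothetical helper; the file is opened only once.
    """
    raw = open(path, "rb")  # an io.BufferedReader
    # peek() does not advance the file position. It may return more (or
    # fewer) bytes than requested, so slice to the first two.
    if raw.peek(2)[:2] == GZIP_MAGIC:
        # gzip.open() accepts an existing file object as well as a name.
        return gzip.open(raw, "rt", encoding=encoding)
    return io.TextIOWrapper(raw, encoding=encoding)
```

Both branches return an object usable in a `with` statement, e.g. `with open_text_maybe_gzip("foo.html.gz") as f: ...`. One caveat: closing a GzipFile that was given an existing file object does not close that underlying file, so in the gzip branch the raw handle stays open until it is garbage collected.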