I seem to remember that the Python gzip module previously allowed you to read non-gzipped files transparently. This was really useful, as it allowed to read an input file wh
The best solution for this would be to use something like https://github.com/ahupp/python-magic with libmagic. You simply cannot avoid at least reading a header to identify a file (unless you implicitly trust file extensions)
If you're feeling spartan the magic number for identifying gzip(1) files is the first two bytes being 0x1f 0x8b.
In [1]: f = open('foo.html.gz')
In [2]: print `f.read(2)`
'\x1f\x8b'
gzip.open is just a wrapper around GzipFile, you could have a function like this that just returns the correct type of object depending on what the source is without having to open the file twice:
#!/usr/bin/python
import gzip
def opener(filename):
f = open(filename,'rb')
if (f.read(2) == '\x1f\x8b'):
f.seek(0)
return gzip.GzipFile(fileobj=f)
else:
f.seek(0)
return f