How to tell if a file is gzip compressed?

挽巷 2021-01-03 19:21

I have a Python program which is going to take text files as input. However, some of these files may be gzip compressed.

Is there a cross-platform, usable from Py

6 answers
  •  春和景丽
    2021-01-03 19:48

    The magic number for gzip-compressed files is 1f 8b. Testing for it is not 100% reliable, but it is highly unlikely that an "ordinary text file" starts with those two bytes; in UTF-8 that byte sequence is not even valid.
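
    The magic-number test can be sketched roughly as follows. This is only an illustration, not a definitive implementation: the helper names looks_like_gzip and open_text are made up for this example, and UTF-8 is assumed as the text encoding.

```python
import gzip

# First two bytes of every gzip stream (the magic number mentioned above).
GZIP_MAGIC = b"\x1f\x8b"


def looks_like_gzip(path):
    """Return True if the file starts with the gzip magic bytes.

    Hypothetical helper: this is a heuristic, not a proof that the
    rest of the file is a valid gzip stream.
    """
    with open(path, "rb") as f:
        return f.read(2) == GZIP_MAGIC


def open_text(path, encoding="utf-8"):
    """Open path for reading text, decompressing if it looks gzipped.

    Hypothetical helper; the UTF-8 default is an assumption.
    """
    if looks_like_gzip(path):
        return gzip.open(path, "rt", encoding=encoding)
    return open(path, "r", encoding=encoding)
```

    The caller can then write `with open_text(name) as f:` and iterate over lines the same way for both kinds of file. A file that merely happens to start with 1f 8b would still fail on the first read, so error handling is needed either way.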

    That said, gzip-compressed files usually carry the suffix .gz. Even gzip(1) itself won't unpack files without it unless you --force it to. You could conceivably rely on the suffix, but you would still have to deal with a possible IOError (which you have to do in any case).

    One problem with your approach is that gzip.GzipFile() will not throw an exception if you feed it an uncompressed file; only a later read() will. This means that you would probably have to implement some of your program logic twice. Ugly.
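
    To make the late failure concrete, here is a minimal sketch of the try-gzip-first approach. It assumes the whole file fits in memory and that the text is UTF-8; read_text is a hypothetical name used only for this example. Reading everything inside one helper is one way to keep the fallback in a single place.

```python
import gzip


def read_text(path, encoding="utf-8"):
    """Return the file's text, trying gzip first and plain text second.

    Hypothetical helper; reads the whole file into memory.
    """
    try:
        with gzip.open(path, "rt", encoding=encoding) as f:
            # Opening succeeds even for a plain file; the gzip header
            # is only checked once data is actually read.
            return f.read()
    except OSError:
        # "Not a gzipped file" is reported as OSError (IOError is an
        # alias in Python 3; gzip.BadGzipFile in 3.8+ is a subclass).
        with open(path, "r", encoding=encoding) as f:
            return f.read()
```

    Note that a truncated gzip file raises EOFError rather than OSError, so it is not caught here, and a corrupt gzip file would fall through to the plain-text branch and most likely fail while decoding.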
