Check whether a PDF-File is valid with Python

借酒劲吻你 2020-12-08 10:50

I receive a file via an HTTP upload and need to be sure it's a PDF file. The programming language is Python, but this should not matter.

I thought of the follow

7 answers
  •  情歌与酒
    2020-12-08 11:22

    If you're on a Linux or OS X box, you could use pdftotext (part of Xpdf). If you pass a non-PDF to pdftotext, it will certainly bark at you, and you can use commands.getstatusoutput (subprocess.getstatusoutput in Python 3, where the commands module no longer exists) to get the output and parse it for these warnings.
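A minimal sketch of the pdftotext approach, assuming the pdftotext binary is on the PATH. Instead of parsing the warning text as suggested above, it relies on the exit status, which pdftotext documents as 0 for success and non-zero for errors. The function name `pdftotext_accepts` and its `binary` parameter are hypothetical, chosen only for illustration.

```python
import shutil
import subprocess


def pdftotext_accepts(path, binary="pdftotext"):
    """Return True if pdftotext processes `path` without error, False if it
    reports an error, and None if the binary is not installed."""
    if shutil.which(binary) is None:
        return None  # tool not available; the caller decides what to do
    result = subprocess.run(
        [binary, path, "-"],        # "-" sends the extracted text to stdout
        stdout=subprocess.DEVNULL,  # the text itself is not needed here
        stderr=subprocess.PIPE,     # warnings and errors, kept for logging
    )
    return result.returncode == 0
```

For an upload, you would first write the received bytes to a temporary file and pass its path in. Note that a non-zero status can also mean a permission problem (for example an encrypted PDF), so a False result is not always proof that the file is not a PDF.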

    If you're looking for a platform-independent solution, you might be able to make use of pyPdf.

    Edit: It's not elegant, but it looks like pyPdf's PdfFileReader will throw an IOError(22) if you attempt to load a non-PDF.
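The platform-independent idea can be sketched as follows. This is an assumption-laden example: it uses pypdf, the maintained successor of pyPdf, and its `PdfReader` class rather than the original `PdfFileReader`, so the exception raised may differ from the IOError mentioned above. The function name `parses_as_pdf` is hypothetical, and the broad `except` is a deliberate simplification.

```python
import io


def parses_as_pdf(data):
    """Return True if pypdf can parse `data` (bytes) as a PDF, False if
    parsing fails, and None if pypdf is not installed."""
    try:
        from pypdf import PdfReader  # third-party package, assumed installed
    except ImportError:
        return None
    try:
        reader = PdfReader(io.BytesIO(data))
        len(reader.pages)  # touch the page tree so that parsing really happens
        return True
    except Exception:
        # Any parse failure is treated as "not a usable PDF" in this sketch.
        return False
```

Working on bytes fits the upload case, since the file content is already in memory and no temporary file is needed.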
