Check whether a PDF-File is valid with Python

借酒劲吻你 2020-12-08 10:50

I get a file via an HTTP upload and need to be sure it's a PDF file. The programming language is Python, but this should not matter.

I thought of the follow

7 answers

隐瞒了意图╮ (original poster)

2020-12-08 11:18
The two most commonly used PDF libraries for Python are:
- pyPdf
- ReportLab
Both are pure Python, so they should be easy to install and should work cross-platform.

With pyPdf it would probably be as simple as doing:
```
from pyPdf import PdfFileReader

# open() works on both Python 2 and 3; the file() builtin exists only in Python 2
doc = PdfFileReader(open("upload.pdf", "rb"))
```
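This should be enough, because the PdfFileReader constructor raises an exception when the file cannot be parsed as a PDF. If you want to do further checking, doc now has getDocumentInfo() and getNumPages() methods (also exposed as the documentInfo and numPages properties).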
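
A full parse like the one above is the stronger test, but for uploads it can help to reject obvious non-PDFs cheaply first. The following is only a rough sketch of such a pre-check using the standard library, not part of pyPdf: `looks_like_pdf` and its window sizes are hypothetical names and values chosen for illustration. It assumes a seekable binary stream and only looks for the `%PDF-` header near the start and the `%%EOF` marker near the end, so a file that passes can still be corrupt.

```python
import io


def looks_like_pdf(stream, header_window=1024, trailer_window=1024):
    """Cheap pre-check: True if the stream has a PDF header and an EOF marker.

    Hypothetical helper; the window sizes are assumptions, not fixed rules.
    """
    # Header: "%PDF-" should appear at (or very near) the start of the file.
    stream.seek(0)
    head = stream.read(header_window)
    if b"%PDF-" not in head:
        return False

    # Trailer: "%%EOF" should appear near the end of the file.
    stream.seek(0, io.SEEK_END)
    size = stream.tell()
    stream.seek(max(0, size - trailer_window))
    tail = stream.read()
    return b"%%EOF" in tail
```

With an uploaded file saved to disk, this could be called as `looks_like_pdf(open("upload.pdf", "rb"))` before passing the same file to the parser.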

As Carl answered, pdftotext is also a good solution and would probably be faster on very large documents (especially ones with many cross-references). However, it might be a little slower on small PDFs because of the system overhead of forking a new process.
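
The pdftotext approach could be sketched roughly as below, assuming Python 3 and that the `pdftotext` command-line tool is installed and on the PATH. `pdftotext_accepts`, its `binary` parameter, and the 30-second timeout are hypothetical choices for illustration. The sketch relies on pdftotext exiting with status 0 when it can open and convert the file, and discards the extracted text.

```python
import subprocess


def pdftotext_accepts(path, binary="pdftotext", timeout=30):
    """Return True if pdftotext exits with status 0 for the given file.

    Hypothetical helper; raises RuntimeError if the tool is not installed.
    """
    try:
        result = subprocess.run(
            [binary, path, "-"],          # "-" sends the extracted text to stdout
            stdout=subprocess.DEVNULL,    # the text itself is not needed
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except FileNotFoundError:
        raise RuntimeError("%s is not installed or not on PATH" % binary)
    except subprocess.TimeoutExpired:
        # Treat a file that takes too long to convert as not accepted.
        return False
    return result.returncode == 0
```

Each call starts a new process, which is the per-file overhead mentioned above.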