Check whether a PDF-File is valid with Python

借酒劲吻你 2020-12-08 10:50

I get a file via an HTTP upload and need to be sure it's a PDF file. The programming language is Python, but this should not matter.

I thought of the follow

7 answers
  •  独厮守ぢ
    2020-12-08 11:31

    Update 2020

    It looks like pdfminer.six is a maintained project (the others, including the one below, seem dead).

    ReportLab is another maintained one (I had mistakenly marked it as dead).

    Original answer

    Since apparently neither PyPdf nor ReportLab is available anymore, the current solution I found (as of 2015) is to use PyPDF2 and catch exceptions (and possibly analyze getDocumentInfo()):

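    import PyPDF2

    # Create a file that is deliberately not a PDF.
    with open("testfile.txt", "w") as f:
        f.write("hello world!")

    # PdfFileReader parses the file when it is constructed and raises
    # PdfReadError if it cannot (names as in the PyPDF2 1.x releases of 2015).
    with open("testfile.txt", "rb") as f:
        try:
            PyPDF2.PdfFileReader(f)
        except PyPDF2.utils.PdfReadError:
            print("invalid PDF file")
        else:
            print("file parsed as a PDF")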
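
    Before handing an upload to a full parser, a cheap standard-library pre-check can reject obvious non-PDFs. The sketch below is not part of the original answer and does not validate the document: it only looks for the "%PDF-" header near the start and the "%%EOF" marker near the end. The function name looks_like_pdf and the 1024-byte windows are assumptions chosen for illustration. A file that passes may still be corrupt, so a parser-based check like the one above is still needed.

```python
import io

PDF_MAGIC = b"%PDF-"    # header that starts a PDF file
EOF_MARKER = b"%%EOF"   # marker that ends a PDF file


def looks_like_pdf(stream, head_window=1024, tail_window=1024):
    """Cheap pre-check on a seekable binary stream.

    Returns True if the header appears within the first `head_window`
    bytes and the end-of-file marker within the last `tail_window` bytes.
    The window sizes are assumptions, not a guarantee of validity.
    """
    stream.seek(0)
    head = stream.read(head_window)
    if PDF_MAGIC not in head:
        return False

    # Find the stream size, then read only the trailing window.
    stream.seek(0, io.SEEK_END)
    size = stream.tell()
    stream.seek(max(0, size - tail_window))
    tail = stream.read()
    return EOF_MARKER in tail


if __name__ == "__main__":
    sample = io.BytesIO(b"%PDF-1.4\n1 0 obj\n<<>>\nendobj\n%%EOF\n")
    print(looks_like_pdf(sample))
    print(looks_like_pdf(io.BytesIO(b"hello world!")))
```

    For an HTTP upload, the same function can be applied to the uploaded file object as long as it is opened in binary mode and supports seek().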
    
