PyPDF 2 Decrypt Not Working


悲哀的现实

I am currently using PyPDF2 as a dependency.

I have encountered some encrypted files and handled them as you normally would (in the following code):

7 Answers

没有蜡笔的小新

2020-12-15 20:52

You can try the PyMuPDF package. It can open encrypted files, and it solved my problem.

Reference: PyMuPDF Documentation
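
As a rough sketch of what this might look like, assuming a recent PyMuPDF release in which the document object exposes `needs_pass`, `authenticate()` and `page_count`. The helper name `unlock_and_count` and the file name `encrypted.pdf` are made up for illustration and are not part of the answer above.

```python
def unlock_and_count(doc, password=""):
    """Authenticate an opened PyMuPDF document if needed, then return its page count.

    `doc` is expected to behave like the object returned by fitz.open().
    """
    if doc.needs_pass:
        # authenticate() returns 0 (falsy) when the password is rejected.
        if not doc.authenticate(password):
            raise ValueError("wrong password")
    return doc.page_count


def main(path="encrypted.pdf"):  # hypothetical file name
    # PyMuPDF is imported here so the helper above has no hard dependency on it.
    import fitz

    doc = fitz.open(path)
    print(unlock_and_count(doc))
```

Call `main()` with the path to your own file. Keeping the check in a small helper makes it easy to try against a stand-in object before PyMuPDF is installed.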


清酒与你

2020-12-15 21:01

The following code could solve this problem:

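import os
import shutil
import subprocess

from PyPDF2 import PdfFileReader

filename = 'input.pdf'  # placeholder: path to the PDF you want to read

fp = open(filename, 'rb')  # PDFs must be opened in binary mode
pdfFile = PdfFileReader(fp)
if pdfFile.isEncrypted:
    try:
        pdfFile.decrypt('')
        print('File Decrypted (PyPDF2)')
    except Exception:
        # PyPDF2 could not decrypt the file, so fall back to qpdf.
        # The file is copied to temp.pdf and decrypted back over the original.
        fp.close()
        shutil.copyfile(filename, 'temp.pdf')
        # An argument list (no shell) keeps file names with spaces intact.
        subprocess.run(['qpdf', '--password=', '--decrypt', 'temp.pdf', filename],
                       check=True)
        os.remove('temp.pdf')
        print('File Decrypted (qpdf)')
        fp = open(filename, 'rb')
        pdfFile = PdfFileReader(fp)
else:
    print('File Not Encrypted')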

广开言路

2020-12-15 21:02

Thanks @Zijian He, your solution worked for me. The fix is to edit the pdf.py file of the PyPDF2 package so that getNumPages() accepts a password argument:

def getNumPages(self, password=''):
    """
    Calculates the number of pages in this PDF file.

    :return: number of pages
    :rtype: int
    :raises PdfReadError: if file is encrypted and restrictions prevent
        this action.
    """

    # Flattened pages will not work on an Encrypted PDF;
    # the PDF file's page count is used in this case. Otherwise,
    # the original method (flattened page count) is used.
    if self.isEncrypted:
        try:
            self._override_encryption = True
            self.decrypt(password)
            return self.trailer["/Root"]["/Pages"]["/Count"]
        except:
            raise utils.PdfReadError("File has not been decrypted")
        finally:
            self._override_encryption = False
    else:
        if self.flattenedPages == None:
            self._flatten()
        return len(self.flattenedPages)

numPages = property(lambda self: self.getNumPages(), None, None)

忘掉有多难

2020-12-15 21:03

To answer my own question: if there are any spaces in the file name, PyPDF2's decrypt function will ultimately fail even though it returns a success code. Stick to underscores when naming your PDFs before running them through PyPDF2.

For example,

Rather than "FDJKL492019 21490 ,LFS.pdf" do something like "FDJKL492019_21490_,LFS.pdf".
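
A minimal sketch of how the renaming could be automated, assuming the only goal is to swap spaces in the file name for underscores. The function names `underscore_name` and `rename_without_spaces` are invented for this example.

```python
from pathlib import Path


def underscore_name(path):
    """Return `path` with spaces in the file name replaced by underscores."""
    p = Path(path)
    # Only the final component is changed; parent directories are left alone.
    return p.with_name(p.name.replace(" ", "_"))


def rename_without_spaces(path):
    """Rename the file on disk if its name contains spaces; return the new path."""
    src = Path(path)
    dst = underscore_name(src)
    if dst != src:
        src.rename(dst)
    return dst
```

Calling `rename_without_spaces()` on a file before opening it with PyPDF2 gives a name that follows the advice above.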


悲哀的现实

2020-12-15 21:06

When you call getNumPages(), the error has nothing to do with whether the file has already been decrypted.

If we take a look at the source code of getNumPages():

def getNumPages(self):
    """
    Calculates the number of pages in this PDF file.

    :return: number of pages
    :rtype: int
    :raises PdfReadError: if file is encrypted and restrictions prevent
        this action.
    """

    # Flattened pages will not work on an Encrypted PDF;
    # the PDF file's page count is used in this case. Otherwise,
    # the original method (flattened page count) is used.
    if self.isEncrypted:
        try:
            self._override_encryption = True
            self.decrypt('')
            return self.trailer["/Root"]["/Pages"]["/Count"]
        except:
            raise utils.PdfReadError("File has not been decrypted")
        finally:
            self._override_encryption = False
    else:
        if self.flattenedPages == None:
            self._flatten()
        return len(self.flattenedPages)

We can see that the self.isEncrypted property controls the flow. The isEncrypted property is read-only and does not change even after the PDF has been decrypted, so getNumPages() always takes the encrypted branch and calls self.decrypt('') with an empty password.

So the easy way to handle this is to add the password as a keyword argument with an empty string as its default value, and to pass your password when calling getNumPages() or any other method built on top of it.
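
If patching the installed package is not an option, the same idea can be sketched as a standalone helper. This is only an illustration: `page_count` is a hypothetical name, and it assumes the legacy PyPDF2 reader API quoted above (`isEncrypted`, `decrypt()`, `trailer`, `getNumPages()`).

```python
def page_count(reader, password=""):
    """Return the page count of a PdfFileReader-like object.

    For encrypted files the caller supplies the password, instead of relying
    on the hard-coded empty string inside getNumPages().
    """
    if reader.isEncrypted:
        # decrypt() returns 0 when the password does not match.
        if reader.decrypt(password) == 0:
            raise ValueError("wrong password")
        # Same lookup that getNumPages() performs for encrypted files.
        return int(reader.trailer["/Root"]["/Pages"]["/Count"])
    return reader.getNumPages()
```

The helper takes the reader as an argument, so it leaves the library source untouched and can be dropped if a later release handles the password itself.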

死守一世寂寞

2020-12-15 21:09

This error may be caused by 128-bit AES encryption on the PDF; see "Query - is there a way to bypass security restrictions on a pdf?"

One workaround is to decrypt every PDF for which isEncrypted is true with qpdf:
```
qpdf --password='' --decrypt input.pdf output.pdf
```
Even if your PDF does not appear to be password-protected, it may still be encrypted with an empty password. The above snippet assumes this is the case.
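
To run the same command from Python, something along these lines should work. It is a sketch that assumes qpdf is installed and on PATH; the function names `qpdf_decrypt_args` and `qpdf_decrypt` are made up for this example.

```python
import shutil
import subprocess


def qpdf_decrypt_args(src, dst, password=""):
    """Build the qpdf argument list for decrypting `src` into `dst`."""
    # A list (rather than a shell string) avoids quoting problems,
    # for example with spaces in file names.
    return ["qpdf", f"--password={password}", "--decrypt", str(src), str(dst)]


def qpdf_decrypt(src, dst, password=""):
    """Run qpdf; raises CalledProcessError on a non-zero exit status."""
    if shutil.which("qpdf") is None:
        raise RuntimeError("qpdf is not installed or not on PATH")
    subprocess.run(qpdf_decrypt_args(src, dst, password), check=True)
```

For example, `qpdf_decrypt("input.pdf", "output.pdf")` mirrors the command above. Note that `check=True` treats any non-zero exit status as a failure, which may be stricter than you want if qpdf only reports warnings.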