Question
How can you run pypdfocr from within a Python script, as opposed to the command line?
The question "How to call pypdfocr functions to use them in a python script?" comes close to the answer I want, but doesn't quite get there.
import pypdfocr
from pypdfocr import pypdfocr
from pypdfocr.pypdfocr import PyPDFOCR as pocr
filepath = 'C:/myfolder/myPDF.pdf'
newfile = pocr.run_conversion(filepath)
This throws an error:
Unbound method run_conversion must be called with PyPDFOCR instance as first argument.
Can someone help me fill in the (likely obvious) missing piece?
Answer 1:
The problem is that you are trying to call run_conversion without an object. run_conversion is a method of the class PyPDFOCR, so you need an instance of that class to call it.
Once you have created a PyPDFOCR object (for instance my_ocr), you should be able to write:
newfile = my_ocr.run_conversion(filepath)
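To illustrate the difference between calling a method on the class and calling it on an instance, here is a minimal sketch. StandInOCR is a hypothetical stand-in written for this example, not the real PyPDFOCR class, and its run_conversion body is made up. The real class may also need its options set up before run_conversion works, so treat this only as a picture of where the error comes from.

```python
class StandInOCR(object):
    """Hypothetical stand-in for PyPDFOCR; not the real library class."""

    def run_conversion(self, pdf_filename):
        # Pretend to convert the file and return a made-up output name.
        return pdf_filename.replace(".pdf", "_ocr.pdf")


filepath = "C:/myfolder/myPDF.pdf"

# Calling through the class puts filepath where `self` is expected,
# so Python raises a TypeError (the wording differs between Python 2 and 3).
try:
    StandInOCR.run_conversion(filepath)
    class_call_failed = False
except TypeError:
    class_call_failed = True

# Calling through an instance binds `self` automatically.
my_ocr = StandInOCR()
newfile = my_ocr.run_conversion(filepath)
```

The only change from the code in the question is the extra step of creating my_ocr before calling the method.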
Answer 2:
I had success with a system call:
import os
# `file` holds the path to the PDF to convert
cmd = "pypdfocr '" + str(file) + "'"
os.system(cmd)
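Building the command as a string can break if the path itself contains a quote character. A possible variation, sketched here on the assumption that you are on Python 3.5 or later and that the pypdfocr executable is on your PATH, passes the arguments as a list through subprocess so no manual quoting is needed. The helper names build_ocr_command and run_ocr are made up for this example.

```python
import subprocess


def build_ocr_command(pdf_path):
    # Each argument is its own list element, so spaces or quotes
    # in the path need no escaping.
    return ["pypdfocr", str(pdf_path)]


def run_ocr(pdf_path):
    # Assumes the `pypdfocr` executable is installed and on PATH.
    # Returns the process exit code (0 usually means success).
    completed = subprocess.run(build_ocr_command(pdf_path), check=False)
    return completed.returncode
```

You would then call run_ocr('C:/myfolder/myPDF.pdf') and check the returned exit code.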
Source: https://stackoverflow.com/questions/47639233/using-pypdfocr-library-from-within-a-python-script