I'm trying to parse a few PDF files that contain engineering drawings to obtain the text data in the files. I tried using Tika as a jar with Python and using it with the jnius
You need to download the Tika server jar and run it first. Check this link: http://wiki.apache.org/tika/TikaJAXRS

1. Download the jar.
2. Run it:
   java -jar tika-server-x.x.jar --port xxxx
3. In your code, add tika.TikaClientOnly = True instead of calling tika.initVM().
4. Change
   parsed = parser.from_file('/path/to/file')
   to
   parsed = parser.from_file('/path/to/file', '/path/to/server')
   The server path is printed in step 2 when the Tika server starts; just plug that in here.

Good luck!
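Putting the steps above together, here is a minimal sketch of the client-only setup. It assumes the tika Python package is installed and that a Tika server is already running. The helper names server_endpoint and extract_text are hypothetical, and the host and port defaults (localhost, 9998) are assumptions; use whatever address your server prints at startup.

```python
def server_endpoint(host: str = "localhost", port: int = 9998) -> str:
    """Build the base URL of an already-running Tika server.

    The defaults are assumptions; use the address the server prints at startup.
    """
    return f"http://{host}:{port}"


def extract_text(pdf_path: str, endpoint: str) -> str:
    """Send a file to the running Tika server and return the extracted text."""
    # Imported here so this module loads even where tika is not installed.
    import tika

    # Client-only mode: talk to the existing server instead of starting one.
    tika.TikaClientOnly = True
    from tika import parser

    parsed = parser.from_file(pdf_path, endpoint)
    # "content" can be None when no text was extracted from the file.
    return parsed.get("content") or ""
```

With the server from step 2 running, a call would look like extract_text('/path/to/file', server_endpoint(port=9998)), with the port replaced by the one you passed to --port.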