Use Tika with Python: RuntimeError: Unable to start Tika server

Submitted by 那年仲夏 on 2019-11-27 06:40:35

Question


I am trying to use the tika package to parse files. Tika is installed successfully, and I started tika-server-1.18.jar from cmd with the command java -jar tika-server-1.18.jar

My code in Jupyter is:

import tika
from tika import parser
parsed = parser.from_file('')

However, I receive the error below:

2018-07-25 10:20:13,325 [MainThread ] [WARNI] Failed to see startup log message; retrying...
2018-07-25 10:20:18,329 [MainThread ] [WARNI] Failed to see startup log message; retrying...
2018-07-25 10:20:23,332 [MainThread ] [WARNI] Failed to see startup log message; retrying...
2018-07-25 10:20:28,340 [MainThread ] [ERROR] Tika startup log message not received after 3 tries.
2018-07-25 10:20:28,340 [MainThread ] [ERROR] Failed to receive startup confirmation from startServer.

RuntimeError: Unable to start Tika Server.


Answer 1:


According to Apache Tika's site, all new versions of the tika-server.jar will require Java 8.

24 April 2018: Apache Tika Release Apache Tika 1.18 has been released! This release includes bug fixes (e.g. extraction from grouped shapes in PPT), security fixes and upgrades to dependencies. PLEASE NOTE: The next versions will require Java 8. Please see the CHANGES.txt file for the full list of changes in the release and have a look at the download page for more information on how to obtain Apache Tika 1.18.

The current docs for the tika Python library are outdated: they claim that Java 7 is needed, but Java 8 must now be installed. This is because the current version of tika-server.jar is automatically downloaded at runtime if it is not found in your temp directory.

After installing Java 8, my basic test code launched the server and worked without error.
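
Before starting Tika, it may help to confirm which Java version is actually on the PATH. The following is a minimal sketch, not part of the tika package: the helper names parse_java_major and installed_java_major are hypothetical, and the sketch assumes the java executable prints a line such as java version "1.8.0_181" or openjdk version "11.0.2".

```python
import re
import subprocess


def parse_java_major(version_output):
    """Return the major Java version found in `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first = int(match.group(1))
    # Legacy scheme: "1.8.0_181" means Java 8. Newer scheme: "11.0.2" means Java 11.
    if first == 1 and match.group(2) is not None:
        return int(match.group(2))
    return first


def installed_java_major():
    """Run `java -version` and return the major version, or None if Java cannot be run."""
    try:
        result = subprocess.run(
            ["java", "-version"], capture_output=True, text=True
        )
    except OSError:
        # Covers the case where no `java` executable is found on the PATH.
        return None
    # `java -version` writes its report to stderr, so both streams are inspected.
    return parse_java_major(result.stderr + result.stdout)


if __name__ == "__main__":
    major = installed_java_major()
    if major is None:
        print("Java was not found or its version could not be read.")
    elif major < 8:
        print(f"Java {major} found; this answer says Java 8 is required.")
    else:
        print(f"Java {major} found.")
```

If this reports a version below 8, or no Java at all, that would be consistent with the startup failure described in the question.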




Answer 2:


You have not passed an argument (specified a file) in your line:

parsed = parser.from_file('')

Give it a file to chew on, e.g.:

parsed = parser.from_file('myfile.txt')

The server didn't start, and presumably that is what triggers the "no log" warning (see line 644 in the source on GitHub),

then another error message tells you it ain't going to play...
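
If an empty path is indeed the trigger, as this answer suggests, one way to rule it out is to validate the path before handing it to Tika. This is only a sketch: parse_existing_file is a hypothetical wrapper, not something the tika package provides, and the only tika call it makes is parser.from_file from the question.

```python
import os


def parse_existing_file(path):
    """Parse `path` with Tika, failing early if the path is empty or not a file."""
    if not path:
        raise ValueError("No file path given; pass a real file to parser.from_file().")
    if not os.path.isfile(path):
        raise FileNotFoundError(f"Not a file: {path!r}")
    # Imported here so the path checks above run even where tika is not installed.
    from tika import parser
    return parser.from_file(path)
```

Calling parse_existing_file('myfile.txt') then either fails with a clear path error or passes the call on to Tika, so any remaining startup error cannot be blamed on the file argument.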




Answer 3:


Download Java. If you already have a version of Java installed, try updating it to the latest version. The version that works for me is 1.18.



Source: https://stackoverflow.com/questions/51514246/use-tika-with-python-runtimeerror-unable-to-start-tika-server
