Question
I'm using Java with Apache Tika 1.18 to convert some files to TXT. When I try to use AutoDetectParser(), I get the following error:
[ERROR ] Error occurred during error handling, give up! org.apache.commons.compress.archivers.ArchiveStreamFactory.detect(Ljava/io/InputStream;)Ljava/lang/String;
[ERROR ] SRVE0777E: Exception thrown by application class 'org.apache.cxf.service.invoker.AbstractInvoker.createFault:162'
org.apache.cxf.interceptor.Fault: org.apache.commons.compress.archivers.ArchiveStreamFactory.detect(Ljava/io/InputStream;)Ljava/lang/String;
    at org.apache.cxf.service.invoker.AbstractInvoker.createFault(AbstractInvoker.java:162)
    at [internal classes]
Caused by: java.lang.NoSuchMethodError: org.apache.commons.compress.archivers.ArchiveStreamFactory.detect(Ljava/io/InputStream;)Ljava/lang/String;
I searched online and found that this error is related to a wrong version of commons-compress: apparently this method does not exist in commons-compress versions earlier than 1.14. In my case the version is 1.16.1.
After building the project, I checked the libraries packaged inside it, and only the correct version is there.
I'm using IBM Liberty 18.0 ... and I'm out of ideas for solving this problem.
When I use a specific parser, such as PDFParser(), everything works fine!
Any ideas?
Thanks
Answer 1:
Source of the issue:
Spark 2.x distributions include old versions of commons-compress, while the Tika library depends on version 1.18 of the commons-compress library.
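Before changing the class path, it can help to confirm which jar the running application actually loads commons-compress from, since a jar provided by the runtime can take precedence over the one packaged with the project. The following is a minimal sketch and not part of the original answer: the class name ClasspathCheck and the helpers locate and hasMethod are hypothetical names, and only standard JDK reflection is used. It assumes the code is run inside the affected application, so that the same class loader resolves the class.

```java
import java.io.InputStream;
import java.security.CodeSource;

// Hypothetical diagnostic helper (illustrative names, not from the original post).
final class ClasspathCheck {

    /** Returns the location the named class was loaded from, or a short note if unknown. */
    static String locate(String className) {
        try {
            Class<?> cls = Class.forName(className);
            CodeSource source = cls.getProtectionDomain().getCodeSource();
            if (source == null || source.getLocation() == null) {
                // Typical for classes that ship with the JDK itself.
                return "(no code source available)";
            }
            return source.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return "(class not found)";
        }
    }

    /** Returns true if the named class exposes a public method with this name and signature. */
    static boolean hasMethod(String className, String methodName, Class<?>... parameterTypes) {
        try {
            Class.forName(className).getMethod(methodName, parameterTypes);
            return true;
        } catch (ClassNotFoundException | NoSuchMethodException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String name = "org.apache.commons.compress.archivers.ArchiveStreamFactory";
        // Shows the jar that wins on the class path at runtime.
        System.out.println("Loaded from: " + locate(name));
        // The NoSuchMethodError in the question refers to detect(InputStream).
        System.out.println("Has detect(InputStream): " + hasMethod(name, "detect", InputStream.class));
    }
}
```

If the reported location is a jar shipped with the runtime rather than the one bundled with the application, or if the method check returns false, an older commons-compress is being picked up first.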
Solution
Use the --driver-class-path argument in your spark-shell or spark-submit command to point to the right version of the commons-compress library.
spark-submit \
  --driver-class-path ~/.m2/repository/org/apache/commons/commons-compress/1.18/commons-compress-1.18.jar \
  --class {your.main.class} \
  ....
Source: https://stackoverflow.com/questions/50412099/apache-tika-archivestreamfactory-detect-error