Apache Jackrabbit JCA 2.7.5 .docx and .xlsx indexing

Posted by 烂漫一生 on 2019-11-29 05:31:12

Ref: http://jackrabbit.510166.n4.nabble.com/Office-2007-documents-not-being-indexed-in-Jackrabbit-2-4-3-td4657380.html

Along the same lines, I have observed that commons-compress-1.5.jar is required by the Tika parser for OOXML documents (i.e. Office 2007 documents).

Now I am able to index and search most document types (Office 2007: docx, pptx, xlsx; Office 2003: doc, ppt, xls; and PDF) using the two steps below:

(1) Updated repository.xml and added [...]. Further details can be found at https://issues.apache.org/jira/browse/JCR-3287

(2) Added commons-compress-1.5.jar to the classpath when running jackrabbit-standalone-2.6.2.jar
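Step (2) can be sketched as a small helper that builds the launch command. This is only an illustration under assumptions: the function name `build_launch_command` is hypothetical, and the main class `org.apache.jackrabbit.standalone.Main` is assumed rather than taken from the post, so check the `Main-Class` entry in the JAR's manifest first. The sketch uses `-cp` with an explicit main class because the Java launcher ignores the user classpath when started with `-jar`.

```python
import os

# Assumption: main class of the standalone server JAR. Verify it against the
# Main-Class entry in META-INF/MANIFEST.MF of your jackrabbit-standalone JAR.
MAIN_CLASS = "org.apache.jackrabbit.standalone.Main"


def build_launch_command(standalone_jar, extra_jars, main_class=MAIN_CLASS):
    """Return an argv list that starts the server with extra JARs on the classpath."""
    # os.pathsep is ':' on Unix-like systems and ';' on Windows.
    classpath = os.pathsep.join([standalone_jar, *extra_jars])
    return ["java", "-cp", classpath, main_class]


if __name__ == "__main__":
    cmd = build_launch_command(
        "jackrabbit-standalone-2.6.2.jar",
        ["commons-compress-1.5.jar"],
    )
    # Print the command so it can be inspected or copied into a shell.
    print(" ".join(cmd))
```

The returned list can be passed to `subprocess.run` as-is if you prefer to launch the server from the script.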

The solution concerns the JARs bundled inside jackrabbit-jca-2.7.5.rar.

There were dependency errors, so I made these changes:

  • add apache-mime4j-0.6.jar
  • add apache-mime4j-core-0.7.jar
  • add commons-compress-1.5.jar

Add these JARs to jackrabbit-jca-2.7.5.rar before deploying it.
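One way to do this without unpacking the archive by hand is sketched below. It relies on a JCA .rar being a ZIP-format resource adapter archive, and it assumes the bundled libraries sit at the archive root, which you should confirm by listing the archive first. The helper name `add_jars_to_rar` is hypothetical; work on a copy of the .rar.

```python
import os
import zipfile


def add_jars_to_rar(rar_path, jar_paths):
    """Append each JAR to the root of the archive, skipping names already present.

    Returns the list of entry names that were actually added.
    """
    added = []
    # Mode "a" appends to an existing ZIP-format archive in place.
    with zipfile.ZipFile(rar_path, "a", compression=zipfile.ZIP_DEFLATED) as archive:
        existing = set(archive.namelist())
        for jar in jar_paths:
            name = os.path.basename(jar)
            if name in existing:
                continue  # avoid writing a duplicate entry
            archive.write(jar, arcname=name)
            existing.add(name)
            added.append(name)
    return added


# Example call (paths are placeholders for wherever the files live locally):
# add_jars_to_rar("jackrabbit-jca-2.7.5.rar", [
#     "apache-mime4j-0.6.jar",
#     "apache-mime4j-core-0.7.jar",
#     "commons-compress-1.5.jar",
# ])
```

Running the helper twice is harmless, since entries that already exist in the archive are skipped.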

With that, indexing of .docx, .xlsx, ... works successfully!

Thank you, @Ashok Felix.
