How do I index documents in SOLR?

混江龙づ霸主 submitted on 2019-12-30 04:40:08

Question


I'm running Solr 1.4 on Ubuntu 10.04 (installed via apt-get solr-tomcat) and it seems to be working fine. I'm having some difficulty finding any coherent info on how to index documents, though. I'm new to SOLR, so bear with me! I have a folder (/mnt/folder) that is a mounted Windows share containing Word and PDF files that I would like indexed. What's the easiest way to get SOLR to index the entire folder?

The documentation for SOLR is pretty poor, and it's impossible to find any decent tutorials on getting things done with it, so any help is greatly appreciated!

S


Answer 1:


Take a look at the Solr wiki; its documentation is pretty thorough.

In particular see the ExtractingRequestHandler, which allows you to index binary files like Word and PDF documents. Here's an introduction to the topic.
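As a rough illustration of how a whole folder might be pushed through the ExtractingRequestHandler, here is a minimal Python sketch. It is not taken from the wiki, and it rests on several assumptions: that the handler is enabled at `/update/extract` in solrconfig.xml, that Solr is reachable at `http://localhost:8080/solr` (a guess for a Tomcat install, so adjust as needed), and that the schema's unique key field is named `id`. The helper names `find_documents`, `extract_url`, and `index_folder` are made up for this example.

```python
import mimetypes
import os
import urllib.parse
import urllib.request

# Assumption: Solr served by Tomcat on port 8080; change to match your install.
SOLR_URL = "http://localhost:8080/solr"
EXTENSIONS = (".pdf", ".doc", ".docx")


def find_documents(folder):
    """Yield paths of Word and PDF files under folder, recursively."""
    for root, dirs, files in os.walk(folder):
        dirs.sort()  # visit subfolders in a stable order
        for name in sorted(files):
            if name.lower().endswith(EXTENSIONS):
                yield os.path.join(root, name)


def extract_url(path, solr_url=SOLR_URL):
    """Build the /update/extract URL, using the file path as the document id.

    Assumption: the schema's unique key field is called "id".
    """
    params = {"literal.id": path}
    return solr_url + "/update/extract?" + urllib.parse.urlencode(params)


def index_folder(folder, solr_url=SOLR_URL):
    """POST each matching file to Solr, then commit once at the end."""
    count = 0
    for path in find_documents(folder):
        content_type = mimetypes.guess_type(path)[0] or "application/octet-stream"
        with open(path, "rb") as handle:
            body = handle.read()
        request = urllib.request.Request(
            extract_url(path, solr_url),
            data=body,
            headers={"Content-Type": content_type},
        )
        urllib.request.urlopen(request).close()
        count += 1
    # A single commit makes all newly added documents searchable.
    commit = urllib.request.Request(
        solr_url + "/update",
        data=b"<commit/>",
        headers={"Content-Type": "text/xml"},
    )
    urllib.request.urlopen(commit).close()
    return count
```

With those assumptions in place, calling `index_folder("/mnt/folder")` would send every Word and PDF file under the mounted share to Solr and return the number of files posted. Committing once at the end, rather than per file, is a design choice that keeps the run faster on large folders.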

If the wiki isn't enough for you, there's also a great book about Solr.




Answer 2:


Processing rich documents with Solr: http://wiki.apache.org/solr/UpdateRichDocuments




Answer 3:


I have found the same challenges with the core documentation, but I came across this very useful reference guide from LucidImagination, which helped to clarify a lot of things about SOLR:

http://docs.lucidworks.com/display/solr/Apache+Solr+Reference+Guide



Source: https://stackoverflow.com/questions/2802000/how-do-i-index-documents-in-solr
