Using ElasticSearch and/or Solr as a datastore for MS Office and PDF documents

一生所求 2020-12-23 10:31

I'm currently designing a full-text search system where users perform text queries against MS Office and PDF documents, and the result will return a list of documents that

5 Answers
  •  不思量自难忘°
    2020-12-23 11:18

    Both Solr and Elasticsearch will index the content of the documents. Solr has that built in; Elasticsearch needs a plugin. It is easy either way, and both use Apache Tika under the covers.
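As a rough illustration of the plugin route on the Elasticsearch side, here is a minimal sketch that only builds the JSON bodies involved, assuming the ingest attachment processor is the plugin in use. The field names `data` and `path` and both function names are made up for this example; nothing is sent to a cluster.

```python
import base64
import json


def attachment_pipeline(source_field="data"):
    """Body for an ingest pipeline that extracts text, then drops the raw bytes.

    Assumption: the ingest attachment processor is installed. The field
    name "data" is illustrative, not required by Elasticsearch.
    """
    return {
        "description": "Extract text from Office/PDF files",
        "processors": [
            # Runs Tika-based extraction on the base64 content of the field.
            {"attachment": {"field": source_field}},
            # Removes the base64 payload so the index keeps only the text.
            {"remove": {"field": source_field}},
        ],
    }


def index_request_body(file_bytes, file_path):
    """Document body to index through the pipeline (hypothetical field names)."""
    return {
        "path": file_path,
        "data": base64.b64encode(file_bytes).decode("ascii"),
    }


if __name__ == "__main__":
    print(json.dumps(attachment_pipeline(), indent=2))
    body = index_request_body(b"%PDF-1.4 sample bytes", "/srv/docs/report.pdf")
    print(json.dumps(body, indent=2))
```

The first body would be registered as a pipeline and the second indexed with that pipeline selected; the extracted text then ends up under an `attachment` field of the indexed document.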

    Neither of them will store the document itself. You can try making them do it, but they are not designed for it and you will suffer.

    Additionally, neither Solr nor Elasticsearch is currently recommended as primary storage. They can do it, but durability is not as mission-critical for them as it is for, say, a filesystem implementation.

    So, I would recommend having the files somewhere else and using Solr/Elasticsearch for searching only. That's where they shine.
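The split recommended above can be sketched roughly as follows: the file stays wherever it is stored, and the search index holds only the extracted text plus a pointer back to it. This is a toy under stated assumptions: the list of entries and the substring match stand in for a real Solr/Elasticsearch index and query, text extraction is assumed to have happened already, and all names are hypothetical.

```python
import hashlib


def make_entry(path, extracted_text):
    """Index entry: searchable text plus a pointer to the file, never the file bytes."""
    return {
        # Stable id derived from the storage path (an assumption for this sketch).
        "id": hashlib.sha256(path.encode("utf-8")).hexdigest(),
        "path": path,
        "content": extracted_text,
    }


def search(entries, term):
    """Toy stand-in for a search-engine query: case-insensitive substring match."""
    needle = term.lower()
    return [e["path"] for e in entries if needle in e["content"].lower()]


if __name__ == "__main__":
    entries = [
        make_entry("/srv/docs/q3-report.pdf", "Quarterly revenue grew in Q3."),
        make_entry("/srv/docs/minutes.docx", "Meeting minutes: budget review."),
    ]
    # The search returns locations; the application then serves the file
    # from the filesystem (or whatever store actually holds it).
    for path in search(entries, "revenue"):
        print(path)
```

Because the index holds only derived data, it can be rebuilt from the stored files if it is ever lost or corrupted.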
