Lucene performance

Submitted by 主宰稳场 on 2019-12-19 03:14:26

Question


Could you please suggest the steps to follow to get good Lucene performance, especially with large data (around 1 TB of PDF files to be indexed)?


Answer 1:


  1. Read Scaling Lucene and Solr.
  2. Define what you need from Lucene. For example, since you are indexing PDFs: do you need to store the full text, only make it searchable, or neither?
  3. Run a small-scale experiment: index a few documents and see whether retrieval is good enough.
  4. Try indexing the whole collection, applying the paper's tips for quick indexing and for retrieval speed. Is retrieval good enough? Is performance good enough?
  5. Iterate.
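
For step 3, it can help to time the small-scale run and extrapolate to the full collection before committing to step 4. The following is only a rough sketch in plain Java with no Lucene calls. The class name, method names, and sample figures are hypothetical, and it assumes indexing time grows linearly with input size, which may not hold once segment merging and disk I/O dominate.

```java
import java.time.Duration;

// Hypothetical helper, not part of Lucene: extrapolates a measured
// sample indexing run to the full corpus, assuming linear scaling.
final class IndexingEstimate {

    /** Bytes per second observed while indexing the sample. */
    static double throughputBytesPerSec(long sampleBytes, Duration elapsed) {
        long millis = elapsed.toMillis();
        if (sampleBytes <= 0 || millis <= 0) {
            throw new IllegalArgumentException(
                "sample size and elapsed time must be positive");
        }
        return sampleBytes / (millis / 1000.0);
    }

    /** Estimated time to index corpusBytes at the sample's throughput. */
    static Duration estimateFullRun(long sampleBytes, Duration elapsed,
                                    long corpusBytes) {
        double rate = throughputBytesPerSec(sampleBytes, elapsed);
        // Round up so the estimate never understates the run time.
        long seconds = (long) Math.ceil(corpusBytes / rate);
        return Duration.ofSeconds(seconds);
    }

    public static void main(String[] args) {
        // Illustrative numbers only; replace with your own measurements.
        long sampleBytes = 2L * 1024 * 1024 * 1024;      // 2 GiB sample of PDFs
        Duration elapsed = Duration.ofMinutes(10);       // measured wall-clock time
        long corpusBytes = 1024L * 1024 * 1024 * 1024;   // about 1 TB, as in the question

        Duration total = estimateFullRun(sampleBytes, elapsed, corpusBytes);
        System.out.println("Estimated full indexing time: "
            + total.toHours() + " hours");
    }
}
```

If the estimate is too long, that is a signal to revisit step 2 (for example, whether the full text needs to be stored) before indexing everything.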



Answer 2:


Please check the tips on the question Optimizing Lucene Performance. Since you are working with a large amount of data, you also need to watch index creation performance. Tips on improving both indexing performance and search performance are available on the Lucene Wiki.



Source: https://stackoverflow.com/questions/824973/lucene-performance
