Lucene performance

一生所求 2021-01-05 13:39

Could you please suggest the steps to follow to get good Lucene performance, especially with large data (around 1 TB of PDF files to be indexed)?

2 answers
  •  既然无缘
    2021-01-05 14:23

    1. Read "Scaling Lucene and Solr".
    2. Define what you need from Lucene. For example, since you are indexing PDFs: do you need to store the full text, only make it searchable, or neither?
    3. Run a small-scale experiment: index a few documents and see whether retrieval is good enough.
    4. Try to index the whole corpus, applying the paper's tips on fast indexing and on indexing for retrieval speed. Is retrieval good enough? Is performance good enough?
    5. Iterate.
    5. Iterate.
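
One way to connect the small-scale experiment in step 3 to the full run in step 4 is to time the sample and extrapolate to the whole corpus. The following is a minimal sketch only: the class `IndexingEstimate` and its method are hypothetical names, not part of Lucene, and it assumes indexing time grows linearly with input size. That assumption is optimistic, since segment merging and PDF text extraction can make the real run slower, so treat the result as a rough lower bound.

```java
import java.time.Duration;

/** Hypothetical helper: extrapolates full-corpus indexing time from a timed sample run. */
final class IndexingEstimate {

    /**
     * Estimates how long indexing corpusBytes would take, given that a sample of
     * sampleBytes took sampleElapsed. Assumes linear scaling (an assumption, see above).
     */
    static Duration estimateFullRun(long sampleBytes, Duration sampleElapsed, long corpusBytes) {
        if (sampleBytes <= 0 || corpusBytes < 0 || sampleElapsed.toMillis() <= 0) {
            throw new IllegalArgumentException("sizes and elapsed time must be positive");
        }
        // How many times larger the full corpus is than the sample.
        double ratio = (double) corpusBytes / sampleBytes;
        // Scale the sample's elapsed time by that ratio, rounded to whole seconds.
        long seconds = Math.round(ratio * sampleElapsed.toMillis() / 1000.0);
        return Duration.ofSeconds(seconds);
    }
}
```

For example, if a 1 GiB sample takes 100 seconds to index, `IndexingEstimate.estimateFullRun(1L << 30, Duration.ofSeconds(100), 1L << 40)` estimates 102,400 seconds (about 28 hours) for 1 TiB. If that is too slow, it is a signal to revisit step 2 and the indexing tips before attempting the full run.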
