Choosing a Solr/Lucene commit strategy

Submitted by 隐身守侯 on 2019-12-04 11:33:51

Use Solr's default auto-commit values, which I believe are quite reasonable. If not, you can adjust them to suit your needs:

<!-- autocommit pending docs if certain criteria are met.  Future versions may expand the available
 criteria -->
<autoCommit>
  <maxDocs>10000</maxDocs> <!-- maximum uncommitted docs before an autocommit is triggered -->
  <maxTime>50000</maxTime> <!-- maximum time (in ms) after adding a doc before an autocommit is triggered -->
</autoCommit>

This means that Solr will commit when 10,000 documents are waiting to be committed, or when 50 seconds (50000 ms) have passed since a document was added, whichever comes first.

According to the Lucene 2.9.3 documentation, commit() makes the added documents visible to readers and writes all pending additions and deletions to the index on disk. It is a costly operation.

So if you want to search some of the documents while others are still being added, or want assurance that a failure will never cost you more than 10,000 added documents, you need to commit every 10,000 records.

On the other hand, if you would rather save the time spent on extra commits and are not afraid of losing documents if the machine fails, commit only once, after all of the documents have been added.
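The commit-once approach can be sketched as two kinds of XML update messages posted to Solr's update handler. This is only an illustration: the field names id and title are assumptions about the schema, and it presumes that <autoCommit> is disabled or that its thresholds are not reached during the load.

```xml
<!-- Message 1 (repeated per batch): add documents without requesting a commit.
     The field names "id" and "title" are hypothetical; use your schema's fields. -->
<add>
  <doc>
    <field name="id">doc-1</field>
    <field name="title">First example document</field>
  </doc>
</add>

<!-- Message 2 (sent once, as a separate request after the last batch):
     a single explicit commit that makes everything added so far searchable. -->
<commit/>
```

Until the final commit is processed, none of the added documents are visible to searches, which is the trade-off described above.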

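The recommended way is to use commitWithin instead of <autoCommit>: each update request states the maximum time, in milliseconds, within which its documents must be committed.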

If you are using SolrJ, almost all update methods accept a commitWithin parameter for this feature.
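For clients that post XML instead of using SolrJ, commitWithin can be set as an attribute on the add message. A minimal sketch, where the 10000 ms value and the field names id and title are illustrative assumptions rather than recommended settings:

```xml
<!-- Ask Solr to commit these documents within 10,000 ms of receiving them.
     The value and the field names "id" and "title" are hypothetical. -->
<add commitWithin="10000">
  <doc>
    <field name="id">doc-1</field>
    <field name="title">Example document</field>
  </doc>
</add>
```

The client never issues an explicit commit here; Solr schedules one so that the documents become searchable within the requested time.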
