Solr faceted search performance recommendations

时光总嘲笑我的痴心妄想, submitted on 2019-12-03 08:22:48
jpountz

Since Solr computes facets on in-memory data structures, facet computation is likely to be CPU-bound. The code that computes facets is already highly optimised (for a multi-valued field, this is the getCounts method in UnInvertedField).

One idea would be to parallelize the computation. Probably the easiest way to do this is to split your collection into several shards, as described in the question "Do multiple Solr shards on a single machine improve performance?".
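As a rough illustration of the sharding idea, the sketch below builds a distributed facet request with Solr's `shards` request parameter, so that each shard computes its facet counts independently before they are merged. The host names, core names, and the `category` field are placeholders that do not come from the original question, and `build_sharded_facet_url` is a hypothetical helper, not part of any Solr client library.

```python
from urllib.parse import urlencode


def build_sharded_facet_url(base_url, shards, facet_field, query="*:*"):
    """Build a select URL that asks Solr to fan a facet query out to several shards.

    base_url    -- select endpoint of any one core (placeholder value below)
    shards      -- list of "host:port/solr/core" strings, one per shard
    facet_field -- name of the field to facet on
    """
    params = [
        ("q", query),
        ("rows", "0"),                 # only facet counts are wanted, no documents
        ("facet", "true"),
        ("facet.field", facet_field),
        ("shards", ",".join(shards)),  # comma-separated list of shards to query
    ]
    return base_url + "?" + urlencode(params)


# Assumed layout: two cores on the same machine (names are made up).
url = build_sharded_facet_url(
    "http://localhost:8983/solr/core0/select",
    ["localhost:8983/solr/core0", "localhost:8983/solr/core1"],
    "category",
)
print(url)
```

Whether this actually helps depends on having spare CPU cores; measuring with a realistic query load is the only reliable way to find out.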

Otherwise, if your term dictionary is small enough and if queries can take only a limited number of forms, you could set up a separate system that maintains the count matrix for every (term, query) pair. For example, if you only allow term queries, this means maintaining the counts for every pair of terms. Beware that this could require a lot of disk space, depending on the total number of terms and queries. If you don't need the counts to be exact, the easiest approach is probably to compute them in a batch process. Otherwise, it is possible, but a little tricky, to keep the counts in sync with Solr.
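For the term-query-only case, the batch computation could look something like the following minimal sketch. It assumes the terms of each document have already been exported as plain Python lists; `build_pair_counts` and `facet_count` are hypothetical names, and nothing here talks to Solr.

```python
from collections import Counter
from itertools import combinations


def build_pair_counts(docs):
    """Count, for every pair of terms, how many documents contain both.

    docs -- iterable of term collections, one per document (a toy stand-in
            for an export of the faceted field; not a Solr API)
    """
    counts = Counter()
    for terms in docs:
        unique = sorted(set(terms))      # ignore repeats within one document
        for term in unique:
            counts[(term, term)] += 1    # diagonal: plain document frequency
        for a, b in combinations(unique, 2):
            counts[(a, b)] += 1          # each pair stored once, with a < b
    return counts


def facet_count(counts, query_term, facet_term):
    """Look up the precomputed count of facet_term among docs matching query_term."""
    key = tuple(sorted((query_term, facet_term)))
    return counts[key]                   # Counter returns 0 for unseen pairs


docs = [
    ["solr", "facet", "lucene"],
    ["solr", "facet"],
    ["lucene", "shard"],
]
counts = build_pair_counts(docs)
print(facet_count(counts, "solr", "facet"))
```

The number of possible pairs grows quadratically with the size of the term dictionary, which is where the storage warning above comes from.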

You could use the topTerms feature of LukeRequestHandler.
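A minimal sketch of such a request follows. It assumes the handler is registered at its usual `/admin/luke` path on the core, and uses the `fl` and `numTerms` parameters to limit the output to one field and a fixed number of terms. The core URL and the `category` field are placeholders, and `build_luke_url` is a hypothetical helper.

```python
from urllib.parse import urlencode


def build_luke_url(core_url, field, num_terms=10):
    """Build a LukeRequestHandler URL asking for the top terms of one field.

    core_url  -- base URL of the core (placeholder value below)
    field     -- field whose most frequent terms are wanted
    num_terms -- how many top terms to report
    """
    params = [
        ("fl", field),                 # restrict the report to this field
        ("numTerms", str(num_terms)),  # number of top terms to return
        ("wt", "json"),
    ]
    return core_url.rstrip("/") + "/admin/luke?" + urlencode(params)


luke_url = build_luke_url("http://localhost:8983/solr/core0", "category", 20)
print(luke_url)
```

Note that these are index-wide term frequencies, not counts restricted to the documents matching a query, so they only substitute for facets when the query matches everything.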
