Lucene: best way to do “starts-with” queries

主宰稳场, posted on 2019-12-05 18:42:00

You can use Lucene payloads for that. They let you attach a custom boost to every term of the field value.

So, when you index your titles, you can give the first term a boost factor of 3 (for example) and lower the boost for each following term:

title: wild|3.0 creatures|2.5 blue|2.0 sea|1.5

title: sea|3.0 creatures|2.5

Indexed this way, the terms nearest the start of the title get the highest boost.

The main problem with this approach is that you have to tokenize the text yourself and add all the boost information "manually", because the Analyzer needs the text structured that way (term1|1.1 term2|3.0 term3).
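The manual step described above can be scripted before the text reaches the analyzer. The following is only a minimal sketch: the function name `add_position_boosts` and the parameters `start`, `step`, and `floor` are hypothetical, the split is on plain whitespace, and the `|` delimiter is assumed to match whatever the payload-aware analyzer is configured with.

```python
def add_position_boosts(title, start=3.0, step=0.5, floor=1.0):
    """Append a '|boost' suffix to every whitespace-separated term.

    The first term gets `start`; each later term gets `step` less,
    never dropping below `floor`. All three values are illustrative.
    """
    parts = []
    for position, term in enumerate(title.split()):
        boost = max(floor, start - step * position)
        parts.append(f"{term}|{boost:.1f}")
    return " ".join(parts)


if __name__ == "__main__":
    print(add_position_boosts("wild creatures blue sea"))
    print(add_position_boosts("sea creatures"))
```

With the default values this reproduces the two example titles shown earlier. The floor keeps very long titles from ending up with zero or negative boosts.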

Alternatively, you could index the title as a whole and also index each token in its own positional field. For example, the text "wild deep blue endless sea" would be indexed like this:

title: wild deep blue endless sea
t1: wild
t2: deep
t3: blue
t4: endless
t5: sea
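The field layout above could be produced with a small helper before the document is handed to the indexer. This is a sketch under assumptions: `positional_fields` is a hypothetical name, tokens are split on whitespace only, and the returned mapping still has to be turned into real index fields by whatever indexing code you use.

```python
def positional_fields(title):
    """Map a title to the full-text field plus one field per token position.

    Returns a dict such as {"title": ..., "t1": first token, "t2": second, ...}.
    """
    fields = {"title": title}
    for position, token in enumerate(title.split(), start=1):
        fields[f"t{position}"] = token
    return fields


if __name__ == "__main__":
    for name, value in positional_fields("wild deep blue endless sea").items():
        print(f"{name}: {value}")
```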

Then, if someone queries "wild deep", the query would be rewritten into:

title:"wild deep" OR (t1:wild AND t2:deep)

This way you still find every document whose title matches the phrase, but documents that also match the positional fields t1..tN, meaning the title starts with the query terms, score higher.
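The rewrite can be sketched as a plain string transformation. The name `rewrite_query` is hypothetical, and the sketch assumes the user input is already normalized the same way as the indexed tokens and contains no query-syntax special characters (no escaping is done here).

```python
def rewrite_query(user_query):
    """Build 'title:"..." OR (t1:... AND t2:...)' from a whitespace-separated query."""
    tokens = user_query.split()
    if not tokens:
        raise ValueError("query must contain at least one term")
    # Phrase clause: matches the terms anywhere in the title.
    phrase = 'title:"' + " ".join(tokens) + '"'
    # Positional clause: matches only when the title starts with the terms.
    positional = " AND ".join(
        f"t{position}:{token}" for position, token in enumerate(tokens, start=1)
    )
    return f"{phrase} OR ({positional})"


if __name__ == "__main__":
    print(rewrite_query("wild deep"))
```

For the input "wild deep" this yields the same query string as the example above.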
