""black lab*" "pet shop""~5 in Lucene (proximity search with multi-word phrases)

孤街醉人 submitted on 2019-12-08 03:45:15

Question


How can I do a proximity search for two multi-word phrases in Lucene? For example, I want to find all matches of black lab* (black labrador, black labradoodle, etc.) within 5 words of the phrase "pet shop". Which analyzer should I be using? Which query parser would you recommend? I'm working with Lucene.NET. I've ported the ComplexPhraseQueryParser from Java to C#, but that parser doesn't seem to do the trick (or perhaps I'm just using it wrong). I'm just getting started with Lucene, so your help is much appreciated.


Answer 1:


You can use a SpanQuery for this:

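// FIELD is the name of the indexed text field.
new SpanNearQuery(
    new SpanQuery[] {
        // "black lab*": slop 0 and inOrder = true make this an exact phrase.
        new SpanNearQuery(
            new SpanQuery[] {
                new SpanTermQuery(new Term(FIELD, "black")),
                new SpanMultiTermQueryWrapper<WildcardQuery>(new WildcardQuery(new Term(FIELD, "lab*"))),
            },
            0,
            true),
        // "pet shop" as an exact phrase.
        new SpanNearQuery(
            new SpanQuery[] {
                new SpanTermQuery(new Term(FIELD, "pet")),
                new SpanTermQuery(new Term(FIELD, "shop")),
            },
            0,
            true),
    },
    5,      // slop: at most 5 positions between the two phrases
    true);  // inOrder: "black lab*" must come first; pass false to allow either order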
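
To make the slop and inOrder arguments concrete, here is a small plain-Java model of what the nested query above is meant to match. It does not use Lucene and is not how Lucene implements spans; it is only a sketch under simplifying assumptions. The names ProximityModel, tokenize, phraseStarts and near are made up for this illustration, the tokenizer is a crude stand-in for a real analyzer, and the slop is taken to be the number of token positions between the end of one phrase and the start of the other.

```java
import java.util.ArrayList;
import java.util.List;

class ProximityModel {

    // Lowercase and split on non-alphanumerics; a crude stand-in for an analyzer.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase().split("[^a-z0-9]+")) {
            if (!t.isEmpty()) tokens.add(t);
        }
        return tokens;
    }

    // A pattern ending in '*' is a prefix match; anything else must match exactly.
    static boolean termMatches(String pattern, String token) {
        return pattern.endsWith("*")
                ? token.startsWith(pattern.substring(0, pattern.length() - 1))
                : token.equals(pattern);
    }

    // Start positions where the phrase occurs as adjacent tokens (inner slop 0, in order).
    static List<Integer> phraseStarts(List<String> tokens, String[] phrase) {
        List<Integer> starts = new ArrayList<>();
        for (int i = 0; i + phrase.length <= tokens.size(); i++) {
            boolean ok = true;
            for (int j = 0; j < phrase.length && ok; j++) {
                ok = termMatches(phrase[j], tokens.get(i + j));
            }
            if (ok) starts.add(i);
        }
        return starts;
    }

    // True if the two phrases occur with at most `slop` positions between them.
    // With inOrder = true, `first` has to appear before `second`.
    static boolean near(String text, String[] first, String[] second, int slop, boolean inOrder) {
        List<String> tokens = tokenize(text);
        for (int a : phraseStarts(tokens, first)) {
            for (int b : phraseStarts(tokens, second)) {
                int gapAfterFirst = b - (a + first.length);   // first ... second
                int gapAfterSecond = a - (b + second.length); // second ... first
                if (gapAfterFirst >= 0 && gapAfterFirst <= slop) return true;
                if (!inOrder && gapAfterSecond >= 0 && gapAfterSecond <= slop) return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] dog = {"black", "lab*"};
        String[] shop = {"pet", "shop"};
        // Phrases in the expected order, 3 positions apart.
        System.out.println(near("A black labrador sat outside the pet shop", dog, shop, 5, true));  // true
        // Reversed order is rejected when inOrder is true...
        System.out.println(near("The pet shop sold a black labradoodle", dog, shop, 5, true));      // false
        // ...and accepted when inOrder is false.
        System.out.println(near("The pet shop sold a black labradoodle", dog, shop, 5, false));     // true
    }
}
```

In this model the second sentence only matches once inOrder is false, which suggests passing false as the last argument of the outer SpanNearQuery if "pet shop" may also come before "black lab*".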

The default Lucene QueryParser doesn't support span queries, but you could try the Surround query parser. I couldn't find much else in the way of documentation.

You may also find this answer and this blog post useful.




Answer 2:


You just need to set the slop.



Source: https://stackoverflow.com/questions/10800357/black-lab-pet-shop5-in-lucene-proximity-search-with-multi-word-phrases
