Question
How can I do a proximity search for two multi-word phrases in Lucene? For example, I want to find all black lab* (black labrador, black labradoodle, etc.) within 5 words of the phrase "pet shop". Which analyzer should I use? Which query parser would you recommend? I'm working with Lucene.NET. I've ported the ComplexPhraseQueryParser from Java to C#, but that parser doesn't seem to do the trick (or perhaps I'm just using it wrong). I'm just getting started with Lucene, so your help is much appreciated.
Answer 1:
You can use a SpanQuery for this:
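    // FIELD is the name of the indexed field being searched.
    new SpanNearQuery(
        new SpanQuery[] {
            // "black" immediately followed by a term matching lab*
            new SpanNearQuery(
                new SpanQuery[] {
                    new SpanTermQuery(new Term(FIELD, "black")),
                    new SpanMultiTermQueryWrapper<WildcardQuery>(new WildcardQuery(new Term(FIELD, "lab*"))),
                },
                0,      // slop 0: the two terms must be adjacent
                true),  // in order
            // the exact phrase "pet shop"
            new SpanNearQuery(
                new SpanQuery[] {
                    new SpanTermQuery(new Term(FIELD, "pet")),
                    new SpanTermQuery(new Term(FIELD, "shop")),
                },
                0,
                true),
        },
        5,      // slop 5: up to 5 positions between the two phrases
        true);  // in order; pass false to match the phrases in either order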
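To make the slop values concrete, here is a rough plain-Java sketch (no Lucene involved) of what the nested, ordered query above checks. The class and helper names (`ProximitySketch`, `adjacentPairs`, `matches`) are made up for illustration, and the simple lowercase-and-split tokenization only approximates what an analyzer would do. It assumes the gap is counted as the number of token positions between the end of the first phrase and the start of the second, which is how I understand slop for an ordered SpanNearQuery.

```java
import java.util.ArrayList;
import java.util.List;

class ProximitySketch {

    // Finds spans {start, end} (end exclusive) where `first` is immediately
    // followed by a token equal to `second`, or starting with it if
    // secondIsPrefix is true (a stand-in for the lab* wildcard).
    static List<int[]> adjacentPairs(String[] tokens, String first,
                                     String second, boolean secondIsPrefix) {
        List<int[]> spans = new ArrayList<>();
        for (int i = 0; i + 1 < tokens.length; i++) {
            boolean secondMatches = secondIsPrefix
                    ? tokens[i + 1].startsWith(second)
                    : tokens[i + 1].equals(second);
            if (tokens[i].equals(first) && secondMatches) {
                spans.add(new int[] {i, i + 2});
            }
        }
        return spans;
    }

    // True if some "black lab*" span is followed, in order, by a
    // "pet shop" span with at most `slop` token positions in between.
    static boolean matches(String text, int slop) {
        String[] tokens = text.toLowerCase().split("\\W+");
        List<int[]> labs = adjacentPairs(tokens, "black", "lab", true);
        List<int[]> shops = adjacentPairs(tokens, "pet", "shop", false);
        for (int[] lab : labs) {
            for (int[] shop : shops) {
                int gap = shop[0] - lab[1]; // positions between the phrases
                if (gap >= 0 && gap <= slop) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // 4 tokens ("was sold at the") sit between the two phrases.
        System.out.println(matches("A black labrador was sold at the pet shop", 5)); // true
        System.out.println(matches("A black labrador was sold at the pet shop", 3)); // false
        // Wrong order for an in-order query.
        System.out.println(matches("The pet shop had a black labradoodle", 5));      // false
    }
}
```

The last case is the one that changes if you pass false for the in-order flag on the outer query: an unordered near query would accept the phrases in either order.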
The default Lucene QueryParser doesn't support span queries, but you could try the Surround query parser. I couldn't find much else in the way of documentation.
You may also find this answer and this blog post useful.
Answer 2:
You just need to set the slop.
Source: https://stackoverflow.com/questions/10800357/black-lab-pet-shop5-in-lucene-proximity-search-with-multi-word-phrases