Lucene.NET boost not working when using the * wildcard

Posted by 血红的双手。 on 2019-12-06 23:31:03

Question


I have two documents, indexed with StandardAnalyzer. I investigated them with Luke and confirmed that my code shows the same behavior.

Document one with boost 1

stored/uncompressed,indexed,tokenized<Description:Nummer ett>
stored/uncompressed,indexed,tokenized<Id:2>
stored/uncompressed,indexed,tokenized<Name:Apa>

Document two with boost 2

stored/uncompressed,indexed,tokenized<Description:Nummer två>
stored/uncompressed,indexed,tokenized<Id:1>
stored/uncompressed,indexed,tokenized<Name:Apa>

Searching for apa in the field Name returns results with the boost applied and in the correct order.

Document 2 has Score 1.1891
Document 1 has Score 0.5945

Searching for ap* returns results in no particular order, all with the same score.

Document 1 Score 1.0000
Document 2 Score 1.0000

Searching for apa* returns results in no particular order, all with the same score.

Document 1 Score 1.0000
Document 2 Score 1.0000

Why is this? I would like documents with a higher boost value to rank first even when I have to use wildcards. Is this possible?

Cheers all cool coders out there!

This is what I want to accomplish.

I have a search string and want matches using a wildcard, for example a search for "Lu" + "*".

Document
 Name
 City

I would like the document whose Name is Lund to rank higher than, for example, a document whose Name is Lunt or whose City is Lund. This is because I will know which documents are most popular. I want to get the documents with the city Stockholm and the names Stockholm and Stockholmen, but ordered as I choose.


Answer 1:


Since WildcardQuery is a subclass of MultiTermQuery, it is rewritten by default into a constant-score query, so every matching document gets the same score of 1 regardless of its boost.

If you check the definition of t.getBoost():

t.getBoost() is a search time boost of term t in the query q as specified in the query text (see query syntax), or as set by application calls to setBoost(). Notice that there is really no direct API for accessing a boost of one term in a multi term query, but rather multi terms are represented in a query as multi TermQuery objects, and so the boost of a term in the query is accessible by calling the sub-query getBoost()

http://lucene.apache.org/core/old_versioned_docs/versions/3_0_1/api/core/org/apache/lucene/search/Similarity.html#formula_termBoost

One possible hack is to set the rewrite method on the query parser:

myCustomQueryParser.SetMultiTermRewriteMethod(MultiTermQuery.SCORING_BOOLEAN_QUERY_REWRITE)
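
To see why the rewrite method matters, here is a minimal toy model in Python. It is not Lucene code and does not reproduce Lucene's scoring formula; the names DOCS, constant_score_search and scoring_search are hypothetical, and the "score" is simply the document boost. It only illustrates the difference between a rewrite that hands every match the same constant and one that scores each expanded term per document.

```python
from fnmatch import fnmatchcase

# Toy index (an assumption for illustration): doc id, Name term, index-time boost.
DOCS = [
    (1, "apa", 1.0),
    (2, "apa", 2.0),
]


def constant_score_search(pattern, query_boost=1.0):
    """Model of a constant-score rewrite: every match gets the same score."""
    return [
        (doc_id, query_boost)
        for doc_id, term, _boost in DOCS
        if fnmatchcase(term, pattern)
    ]


def scoring_search(pattern):
    """Model of a scoring rewrite: each match keeps a per-document factor.

    Real Lucene scores also involve tf, idf and norms; here the document
    boost alone stands in for the score.
    """
    hits = [
        (doc_id, boost)
        for doc_id, term, boost in DOCS
        if fnmatchcase(term, pattern)
    ]
    # Highest score first, as a ranked result list would be.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


print(constant_score_search("ap*"))
print(scoring_search("ap*"))
```

In this model the constant-score variant cannot distinguish the two documents, while the scoring variant ranks document 2 first, which matches the behavior described in the question for ap* versus the plain term apa.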


Source: https://stackoverflow.com/questions/10352404/lucene-net-boost-not-working-when-using-wildcard
