Sunspot — Boost records where matches occur early in the text

Posted by 流过昼夜 on 2020-01-03 16:44:56

Question


For example, let's say there is a record in my DB that has the text "Hormel Corporation" and my search term is something like "Hormel Corned Beef 16 Ounces". As my current configuration stands, the top results will be other records, even though "Hormel Corporation" is the one I'm looking for. I think the solution to my problem would be to give priority to records where a match comes earliest in the search term. I've read all the docs, but I have had trouble figuring out how this might work.

I only have one field -- name. The name field for the record I want reads "Hormel Corporation". However, when I search for "Hormel Corned Beef 16 Ounces", the top result is not "Hormel Corporation" but something seemingly random, while the record I'm looking for is 3rd or 4th in the results.

Thanks a lot!


Answer 1:


I had a similar problem to solve. So I stored my data in many fields:

title
keywords (up to 10 words)
abstract (a paragraph)
text (as long as you like)

For querying, I used the dismax query parser over the fields with different weights:

title^20
keywords^20
abstract^12
text^1

So if you

  1. define your data schema well
  2. use dismax
  3. determine per-field weights for your queries

when you search for "Hormel Corned Beef 16 Ounces", a result whose title is "Hormel Corp" will score better than a document whose body contains "...For the dish, we recommend a can of Hormel Corned Beef 16 Ounces..."
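As a rough illustration of the weighting above, here is a minimal Ruby sketch that assembles the request parameters for a dismax query. The helper name `dismax_params` is hypothetical, and the default weights are simply the ones listed in this answer; `defType`, `q` and `qf` are the standard Solr parameters for selecting the parser, passing the query and listing the weighted fields.

```ruby
require 'uri'

# Hypothetical helper: builds the parameters for a dismax request.
# The default weights are the ones from the answer above; tune them for your data.
def dismax_params(query, weights = { 'title' => 20, 'keywords' => 20,
                                     'abstract' => 12, 'text' => 1 })
  # qf ("query fields") is a space-separated list of field^boost pairs.
  qf = weights.map { |field, boost| "#{field}^#{boost}" }.join(' ')
  { 'defType' => 'dismax', 'q' => query, 'qf' => qf }
end

params = dismax_params('Hormel Corned Beef 16 Ounces')
puts params['qf']                 # title^20 keywords^20 abstract^12 text^1
puts URI.encode_www_form(params)  # URL-encoded string to append to a select request
```

The whole search phrase is passed as `q` unchanged; the per-field weights alone decide that a match in title counts for more than a match in text.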


Edit, following OP's comments.

OP's point is: within a title, the first n words matter more than the rest.

I suggest a data model with two fields: title_first_words and title. The client application (sorry, you cannot use DIH, the Data Import Handler, directly) will have to extract the first n words from the title and store them in title_first_words; the full title is stored in title.
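A minimal sketch of that client-side step, in plain Ruby. The helper name `title_fields` and the default of n = 2 are assumptions made for illustration; how the resulting hash is sent to the index depends on your client.

```ruby
# Hypothetical helper: splits a title into the two fields described above.
# n is the number of leading words that should carry extra weight (an assumption to tune).
def title_fields(title, n = 2)
  words = title.split
  {
    'title_first_words' => words.first(n).join(' '), # only the first n words
    'title'             => title                     # the full title, unchanged
  }
end

doc = title_fields('Hormel Corned Beef 16 Ounces', 2)
puts doc['title_first_words']  # Hormel Corned
puts doc['title']              # Hormel Corned Beef 16 Ounces
```

Titles shorter than n words are simply stored whole in both fields.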

For searching, you can give the entire query to the dismax parser. The query parser is then biased toward title_first_words with weights such as title_first_words^4 title^1. Thus the first n words will have a bigger impact on a given search.




Answer 2:


Have you tried boosting the importance of each word in the search term, like this:

Hormel^100 Corned^20 Beef^5 16^2 Ounces^1
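A boosted query like the one above could be generated from the raw search term with a small helper. This is only a sketch: the name `boost_terms` is hypothetical, the default boosts are the ones from the example, and it assumes the query parser in use accepts per-term `^` boosts (the standard Lucene query syntax does). Terms containing query-syntax special characters would need escaping first, which is not handled here.

```ruby
# Hypothetical helper: attaches a boost to each term by position,
# so that earlier terms weigh more than later ones.
# Terms beyond the end of the boosts list fall back to a boost of 1.
def boost_terms(query, boosts = [100, 20, 5, 2, 1])
  query.split.each_with_index.map do |term, i|
    "#{term}^#{boosts.fetch(i, 1)}"
  end.join(' ')
end

puts boost_terms('Hormel Corned Beef 16 Ounces')
# Hormel^100 Corned^20 Beef^5 16^2 Ounces^1
```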


Source: https://stackoverflow.com/questions/9101478/sunspot-boost-records-where-matches-occur-early-in-the-text
