Elasticsearch sort based on the number of occurrences a string appears in an array

Submitted by 半世苍凉 on 2021-02-18 19:35:48

Question


I have an array field containing a list of strings, e.g. ["NY", "CA"].

At search time I have a filter which matches any of the strings in the array.

I would like to sort the results so that documents with the most occurrences of the searched string ("NY") come first.

Results should include:

Document 1: ["CA", "NY", "NY"]
Document 2: ["NY", "FL"]
Document 3: ["NY", "CA", "NY", "NY"]

Results should be ordered as follows:

Document 3, Document 1, Document 2
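
To make the desired ordering concrete, here is a minimal Python sketch that counts occurrences and sorts outside Elasticsearch. It only illustrates the expected result; the `docs` dictionary and the `rank_by_occurrences` helper are hypothetical names, not part of any Elasticsearch API.

```python
# Hypothetical in-memory copies of the three example documents.
docs = {
    1: ["CA", "NY", "NY"],
    2: ["NY", "FL"],
    3: ["NY", "CA", "NY", "NY"],
}


def rank_by_occurrences(docs, term):
    """Return ids of documents containing `term`, most occurrences first."""
    counts = {doc_id: values.count(term) for doc_id, values in docs.items()}
    # Keep only documents that match the filter at all.
    matching = [doc_id for doc_id, n in counts.items() if n > 0]
    return sorted(matching, key=lambda doc_id: counts[doc_id], reverse=True)


print(rank_by_occurrences(docs, "NY"))
```

With the sample data this puts document 3 first, then document 1, then document 2.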

Is this possible? If so, how?


Answer 1:


For those curious, I was not able to boost based on how many times the term occurs in the array. I did, however, accomplish what I needed with the following:

curl -X POST "http://localhost:9200/index/document/1" -d '{"id":1,"states_ties":["CA"],"state_abbreviation":"CA","worked_in_states":["CA"],"training_in_states":["CA"]}'
curl -X POST "http://localhost:9200/index/document/2" -d '{"id":2,"states_ties":["CA","NY"],"state_abbreviation":"FL","worked_in_states":["NY","CA"],"training_in_states":["NY","CA"]}'
curl -X POST "http://localhost:9200/index/document/3" -d '{"id":3,"states_ties":["CA","NY","FL"],"state_abbreviation":"NY","worked_in_states":["NY","CA"],"training_in_states":["NY","FL"]}'

curl -X GET 'http://localhost:9200/index/_search?size=10&pretty' -d '{
  "query": {
    "custom_filters_score": {
      "query": {
        "terms": {
          "states_ties": [
            "CA"
          ]
        }
      },
      "filters": [
        {
          "filter": {
            "term": {
              "state_abbreviation": "CA"
            }
          },
          "boost": 1.03
        },
        {
          "filter": {
            "terms": {
              "worked_in_states": [
                "CA"
              ]
            }
          },
          "boost": 1.02
        },
        {
          "filter": {
            "terms": {
              "training_in_states": [
                "CA"
              ]
            }
          },
          "boost": 1.01
        }
      ],
      "score_mode": "multiply"
    }
  },
  "sort": [
    {
      "_score": "desc"
    }
  ]
}'

Results (id: score):

1: 0.75584483
2: 0.73383
3: 0.7265643
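
The ordering follows from how many of the boosted filters each document matches. The Python sketch below replays that arithmetic, assuming that "multiply" multiplies the boosts of all matching filters into the base query score. The `boost_factor` helper and the `DOCS`, `BOOSTS`, and `REPORTED` names are hypothetical; the values are copied from the documents, query, and results above.

```python
# Boosts copied from the custom_filters_score query above.
BOOSTS = {
    "state_abbreviation": 1.03,
    "worked_in_states": 1.02,
    "training_in_states": 1.01,
}

# The three documents indexed above (only the boosted fields).
DOCS = {
    1: {"state_abbreviation": "CA", "worked_in_states": ["CA"],
        "training_in_states": ["CA"]},
    2: {"state_abbreviation": "FL", "worked_in_states": ["NY", "CA"],
        "training_in_states": ["NY", "CA"]},
    3: {"state_abbreviation": "NY", "worked_in_states": ["NY", "CA"],
        "training_in_states": ["NY", "FL"]},
}

# Scores reported in the results above.
REPORTED = {1: 0.75584483, 2: 0.73383, 3: 0.7265643}


def boost_factor(doc, state):
    """Multiply the boosts of every filter whose field contains `state`."""
    factor = 1.0
    for field, boost in BOOSTS.items():
        value = doc[field]
        values = value if isinstance(value, list) else [value]
        if state in values:
            factor *= boost
    return factor


for doc_id, doc in DOCS.items():
    factor = boost_factor(doc, "CA")
    # Dividing the reported score by the factor recovers the base score.
    print(doc_id, round(factor, 6), round(REPORTED[doc_id] / factor, 4))
```

Document 1 matches all three filters, document 2 matches two, and document 3 matches one. Dividing each reported score by its factor gives roughly the same base score (about 0.7123) for all three, so under this assumption the boosts alone decide the order.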



Answer 2:


This can be accomplished by the standard Lucene scoring implementation. If you simply search for "NY" without specifying a sort order, the results are sorted by relevance, and a document with more occurrences of the term is assigned a higher relevance, all else being equal.
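
As a rough illustration of why term frequency affects relevance, the Python sketch below models two factors of Lucene's classic TF-IDF similarity: the term-frequency factor sqrt(freq) and the field-length norm 1/sqrt(number of terms). It is a simplified model, not Elasticsearch's actual scoring. It ignores IDF, boosts, and the lossy encoding of norms, and it assumes each array value is indexed as a single term with frequencies and norms enabled for the field. The `simplified_score` helper is a hypothetical name.

```python
import math

# The example documents from the question.
docs = {
    1: ["CA", "NY", "NY"],
    2: ["NY", "FL"],
    3: ["NY", "CA", "NY", "NY"],
}


def simplified_score(values, term):
    """Approximate tf * lengthNorm for one field, ignoring every other factor."""
    freq = values.count(term)
    if freq == 0:
        return 0.0
    tf = math.sqrt(freq)                        # more occurrences, higher tf
    length_norm = 1.0 / math.sqrt(len(values))  # longer fields are penalized
    return tf * length_norm


ranked = sorted(docs, key=lambda d: simplified_score(docs[d], "NY"), reverse=True)
print(ranked)
```

Under these assumptions the sample data comes out as document 3, document 1, document 2. The length norm is the "all else being equal" caveat: a long array with few matches can score below a short array with one match.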



Source: https://stackoverflow.com/questions/15330948/elasticsearch-sort-based-on-the-number-of-occurrences-a-string-appears-in-an-arr
