Understanding Elasticsearch synonym

Submitted by 无人久伴 on 2019-12-25 08:35:35

Question


I'm very new to Elasticsearch and I'm not sure what the best way to use synonyms is.

I have two fields: hashtag and name. The hashtag field contains names in lower case without whitespace, whereas the name field contains the actual name with its normal capitalization and spacing.

I want to search by name in its proper format and get back all docs with a matching name, along with the docs where the hashtag matches as well.

For example, name contains "Tom Cruise" and hashtag is "tomcruise". I want to search for "Tom Cruise" and get back all docs that have either the name "Tom Cruise" or the hashtag "tomcruise".

Here is the way I'm creating this index:

PUT /my_index
{
    "settings": {
        "number_of_shards": 1,
        "analysis": {
            "filter": {
                "synonym": {
                    "type": "synonym",
                    "ignore_case": true,
                    "synonyms": [
                        "tom cruise => tomcruise, tom cruise"
                    ]
                }
            },
            "analyzer": {
                "synonym": {
                    "tokenizer": "whitespace",
                    "filter": ["synonym"]
                }
            }
        }
    }
}

PUT /my_index/my_type/_mapping
{
    "my_type": {
        "properties": {
            "hashtag": {
                "type": "text",
                "analyzer": "standard",
                "search_analyzer": "synonym"
            },
            "name": {
                "type": "keyword"
            }
        }
    }
}


POST /my_index/my_type/_bulk
{ "index": { "_id": 1 }}
{ "hashtag": "tomcruise", "name": "abc" }
{ "index": { "_id": 2 }}
{ "hashtag": "tomhanks", "name": "efg" }
{ "index": { "_id": 3 }}
{ "hashtag": "tomcruise", "name": "efg" }
{ "index": { "_id": 4 }}
{ "hashtag": "news", "name": "Tom Cruise" }
{ "index": { "_id": 5 }}
{ "hashtag": "celebrity", "name": "Kate Winslet" }
{ "index": { "_id": 6 }}
{ "hashtag": "celebrity", "name": "Tom Cruise" }
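
To pin down the expected result for these six documents, here is a minimal Python sketch that applies the requirement directly: a doc is a hit if its name is exactly "Tom Cruise" or its hashtag is exactly "tomcruise". It does not talk to Elasticsearch. The `DOCS` dict and the `expected_hits` helper are names made up for this illustration; only the field values come from the bulk request.

```python
# Sample documents copied from the bulk request (id -> fields).
DOCS = {
    1: {"hashtag": "tomcruise", "name": "abc"},
    2: {"hashtag": "tomhanks", "name": "efg"},
    3: {"hashtag": "tomcruise", "name": "efg"},
    4: {"hashtag": "news", "name": "Tom Cruise"},
    5: {"hashtag": "celebrity", "name": "Kate Winslet"},
    6: {"hashtag": "celebrity", "name": "Tom Cruise"},
}


def expected_hits(docs, name, hashtag):
    """Return the sorted ids of docs whose name or hashtag matches exactly."""
    return sorted(
        doc_id
        for doc_id, doc in docs.items()
        if doc["name"] == name or doc["hashtag"] == hashtag
    )


print(expected_hits(DOCS, "Tom Cruise", "tomcruise"))  # [1, 3, 4, 6]
```

So a correct search for "Tom Cruise" should return docs 1, 3, 4 and 6, and nothing else.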

When I run the analyzer, it looks like I get the right tokens: [tomcruise, tom, cruise]

GET /my_index/_analyze
{
  "text": "Tom Cruise",
  "analyzer": "synonym"
}
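
That token list can be reproduced with a rough Python model of the synonym analyzer. This is a simplified sketch and not Elasticsearch's implementation: it assumes whitespace tokenization, a single explicit `=>` rule, and case-insensitive matching as with "ignore_case": true, and it ignores token positions entirely. `parse_rule` and `expand` are hypothetical helper names.

```python
def parse_rule(rule):
    """Split an explicit 'a b => c, a b' rule into (source, targets) token tuples."""
    lhs, rhs = rule.split("=>")
    source = tuple(lhs.split())
    targets = [tuple(alternative.split()) for alternative in rhs.split(",")]
    return source, targets


def expand(text, rules, ignore_case=True):
    """Whitespace-tokenize text and replace matched sources by all their targets."""
    tokens = text.split()
    # Matching happens on a lowercased copy when ignore_case is set.
    compare = [t.lower() for t in tokens] if ignore_case else list(tokens)
    out = []
    i = 0
    while i < len(tokens):
        for source, targets in rules:
            n = len(source)
            if tuple(compare[i:i + n]) == source:
                for target in targets:
                    out.extend(target)
                i += n
                break
        else:
            # No rule matched at this position: keep the token unchanged.
            out.append(tokens[i])
            i += 1
    return out


RULES = [parse_rule("tom cruise => tomcruise, tom cruise")]
print(expand("Tom Cruise", RULES))  # ['tomcruise', 'tom', 'cruise']
```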

Here's how I'm searching:

POST /my_index/my_type/_search?pretty
{
    "query": {
        "multi_match": {
            "query": "Tom Cruise",
            "fields": ["hashtag", "name"]
        }
    }
}

  • Is this the right way to achieve my search requirement?
  • What's the best way to run a search like this in Kibana? Right now I have to use the entire query, but what do I need to do if I just want to type "Tom Cruise" and get the expected result? I tried "_all", but it didn't work.

Updated:

After discussing this with Russ Cam, and with my limited knowledge of Elasticsearch, I concluded that using synonyms would be overkill for my search requirement. So I changed the search analyzer to generate the same tokens and got the same result. I'd still like to know whether I'm doing this the right way.

PUT /my_index
{
    "settings": {
        "number_of_shards": 1, 
        "analysis": {
            "filter": {
                "word_joiner": {
                    "type": "word_delimiter",
                    "catenate_all": true
                }
            },
            "analyzer": {
                "test_analyzer" : {
                    "type": "custom",
                    "tokenizer" : "keyword",
                    "filter" : ["lowercase", "word_joiner"]
                }
            }
        }
    }
}
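
As a sanity check on why this analyzer yields the same tokens as the synonym rule, here is a rough Python model of the chain: keyword tokenizer (the whole input is one token), then lowercase, then a word_delimiter filter with "catenate_all": true. It is only a sketch under simplifying assumptions: it splits on non-alphanumeric characters alone, ignores the filter's splitting on letter/digit boundaries, and returns a set because it does not model token order or positions. `word_joiner_tokens` is a hypothetical name.

```python
import re


def word_joiner_tokens(text):
    """Approximate the tokens produced by keyword + lowercase + word_joiner."""
    lowered = text.lower()  # the lowercase filter
    # Word parts: split on any run of non-alphanumeric characters.
    parts = [part for part in re.split(r"[\W_]+", lowered) if part]
    tokens = set(parts)
    if len(parts) > 1:
        # catenate_all: also emit all parts joined into one token.
        tokens.add("".join(parts))
    return tokens


print(sorted(word_joiner_tokens("Tom Cruise")))  # ['cruise', 'tom', 'tomcruise']
```

Under these assumptions, "Tom Cruise" produces tom, cruise and tomcruise without any per-name synonym rule, which is what makes the approach general.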

Source: https://stackoverflow.com/questions/39060966/understanding-elasticsearch-synonym
