elasticsearch: search for parts of words

Submitted by 痴心易碎 on 2019-12-13 07:22:12

Question


I'm trying to learn how to use Elasticsearch (using elasticsearch-php for queries). I have indexed a few documents, which look something like this:

['id' => 1, 'name' => 'butter', 'category' => 'food'],
['id' => 2, 'name' => 'buttercup', 'category' => 'food'],
['id' => 3, 'name' => 'something else', 'category' => 'butter']

Now I created a search query which looks like this:

$query = [
    'filtered' => [
        'query' => [
            'bool' => [
                'should' => [
                    ['match' => [
                        'name' => [
                            'query' => $val,
                            'boost' => 7
                        ]
                    ]],
                    ['match' => [
                        'category' => [
                            'query' => $val,
                            'boost' => 5
                        ]
                    ]],
                ],
            ]
        ]
    ]
];

where $val is the search term. This works nicely, with one problem: when I search for "butter", I find ids 1 and 3 but not 2, because the search term seems to match whole words only. Is there a way to search "within words", or, in MySQL terms, to do something like WHERE name LIKE '%val%'?


Answer 1:


You can try the wildcard query:

$query = [
    'filtered' => [
        'query' => [
            'bool' => [
                'should' => [
                    ['wildcard' => [
                        'name' => [
                            // the wildcard query takes its pattern under 'value', not 'query'
                            'value' => '*'.$val.'*',
                            'boost' => 7
                        ]
                    ]],
                    ['wildcard' => [
                        'category' => [
                            'value' => '*'.$val.'*',
                            'boost' => 5
                        ]
                    ]],
                ],
            ]
        ]
    ]
];

or the query_string query:

$query = [
    'filtered' => [
        'query' => [
            'bool' => [
                'should' => [
                    ['query_string' => [
                        'default_field' => 'name',
                        'query' => '*'.$val.'*',
                        'boost' => 7
                    ]],
                    ['query_string' => [
                        'default_field' => 'category',
                        'query' => '*'.$val.'*',
                        'boost' => 5
                    ]],
                ],
            ]
        ]
    ]
];

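Both will work, but neither performs well on large amounts of data: a pattern that starts with * cannot use the sorted term dictionary to narrow the search, so Elasticsearch has to test the pattern against a large share of the terms in the field.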

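The better approach is to do the work at index time: use a custom analyzer with a standard tokenizer and an ngram token filter, which slices each token into smaller substrings. An ordinary match query for "butter" can then find "buttercup", because "butter" is one of the substrings indexed for it.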
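
As a rough sketch of what that could look like with elasticsearch-php: the index name (products), the analyzer and filter names, and the gram sizes (3 to 10) are all assumptions chosen for illustration, not values from the question. The mapping syntax assumes a recent Elasticsearch version (7.x), where the filtered query no longer exists, so bool is used directly. On 2.x the field type would be string rather than text, the mapping would sit under a type name, and the max_ngram_diff setting would have to be dropped.

```php
<?php
// Sketch only: 'products', 'substring_filter', 'substring_analyzer' and the
// gram sizes are hypothetical names/values picked for this example.

// Index settings and mapping. At index time each word is lowercased and then
// every 3- to 10-character substring of it is stored as a separate token.
$indexParams = [
    'index' => 'products',
    'body' => [
        'settings' => [
            'index' => [
                // Elasticsearch 7+ limits (max_gram - min_gram) to 1 by default,
                // so the limit is raised here. Remove this on older versions.
                'max_ngram_diff' => 7,
                'analysis' => [
                    'filter' => [
                        'substring_filter' => [
                            'type' => 'ngram',
                            'min_gram' => 3,
                            'max_gram' => 10,
                        ],
                    ],
                    'analyzer' => [
                        'substring_analyzer' => [
                            'type' => 'custom',
                            'tokenizer' => 'standard',
                            'filter' => ['lowercase', 'substring_filter'],
                        ],
                    ],
                ],
            ],
        ],
        'mappings' => [
            'properties' => [
                'name' => [
                    'type' => 'text',
                    'analyzer' => 'substring_analyzer',
                    // Do not n-gram the search term itself, only lowercase/split it.
                    'search_analyzer' => 'standard',
                ],
                'category' => [
                    'type' => 'text',
                    'analyzer' => 'substring_analyzer',
                    'search_analyzer' => 'standard',
                ],
            ],
        ],
    ],
];

// The search itself is the plain match query from the question, without wildcards.
$val = 'butter';
$searchParams = [
    'index' => 'products',
    'body' => [
        'query' => [
            'bool' => [
                'should' => [
                    ['match' => ['name' => ['query' => $val, 'boost' => 7]]],
                    ['match' => ['category' => ['query' => $val, 'boost' => 5]]],
                ],
            ],
        ],
    ],
];

// With an elasticsearch-php client in $client (construction omitted here):
// $client->indices()->create($indexParams);
// $results = $client->search($searchParams);
```

The trade-off is a larger index in exchange for cheap queries. With these gram sizes, search terms shorter than 3 or longer than 10 characters would not match through the n-grams, so min_gram and max_gram should be chosen to fit the terms users are expected to type.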



Source: https://stackoverflow.com/questions/37315275/elasticsearch-search-for-parts-of-words
