Elasticsearch “starts with” first word in phrases

筅森魡賤 提交于 2019-12-20 11:55:23

问题


I try to implement an A - Z navigation for my content with Elasticsearch. What I need, is displaying all results which begins with e.g. a,b,c,... etc.

I've tried:

"query": {
        "match_phrase_prefix" : {
        "title" : {
            "query" : "a"
        }
      }
    }

The query mentioned above also display results, where within the string a word begins with a. Example:

"title": "Apfelpfannkuchen",

"title": "Affogato",

"title": "Kalbsschnitzel an Aceto Balsamico",

I want to display only phrase where the FIRST word begins with a.

Here the mapping I use:

$params = array(
            'index' => 'my_index',
            'body' => array(
                'settings' => array(
                    'number_of_shards' => 1,
                    'index' => array(
                        'analysis' => array(
                            'filter' => array(
                                'nGram_filter' => array(
                                    'type' => 'nGram',
                                    'min_gram' => 2,
                                    'max_gram' => 20,
                                    'token_chars' => array('letter', 'digit', 'punctuation', 'symbol')
                                )
                            ),
                            'analyzer' => array(
                                'nGram_analyzer' => array(
                                    'type' => 'custom',
                                    'tokenizer' => 'whitespace',
                                    'filter' => array('lowercase', 'asciifolding', 'nGram_filter')
                                ),
                                'whitespace_analyzer' => array(
                                    'type' => 'custom',
                                    'tokenizer' => 'whitespace',
                                    'filter' => array('lowercase', 'asciifolding')
                                ),
                                'analyzer_startswith' => array(
                                    'tokenizer' => 'keyword',
                                    'filter' => 'lowercase'
                                )
                            )
                        )
                    )
                ),
                'mappings' => array(
                    'tags' => array(
                        '_all' => array(
                            'type' => 'string',
                            'index_analyzer' => 'nGram_analyzer',
                            'search_analyzer' => 'whitespace_analyzer'
                        ),
                        'properties' => array()

                    ),
                    'posts' => array(
                        '_all' => array(
                            'index_analyzer' => 'nGram_analyzer',
                            'search_analyzer' => 'whitespace_analyzer'
                        ),
                        'properties' => array(
                            'title' => array(
                                'type' => 'string',
                                'index_analyzer' => 'analyzer_startswith',
                                'search_analyzer' => 'analyzer_startswith'
                            )
                        )
                    )
                )
            )
        );

回答1:


If you are using default mapping then it will not work for you.

You need to use keyword tokenizer and lowercase filter in mapping.

Mapping Will be :

{
    "settings": {
        "index": {
            "analysis": {
                "analyzer": {
                    "analyzer_startswith": {
                        "tokenizer": "keyword",
                        "filter": "lowercase"
                    }
                }
            }
        }
    },
    "mappings": {
        "test_index": {
            "properties": {
                "title": {
                    "search_analyzer": "analyzer_startswith",
                    "index_analyzer": "analyzer_startswith",
                    "type": "string"
                }
            }
        }
    }
}

Search query on test_index :

{
    "query": {
        "match_phrase_prefix": {
            "title": {
                "query": "a"
            }
        }
    }
}

It will return all post title starting with a




回答2:


I'm updating @Roopendra 's answer according to this gist. Thus, there was an update and in recent versions search and index initializers seem not to work, there were replaced only to initializers, also string needs to be replaced to text.

Thus, we have the following mapping file:

{
    "settings": {
        "index": {
            "analysis": {
                "analyzer": {
                    "analyzer_startswith": {
                        "tokenizer": "keyword",
                        "filter": "lowercase"
                    }
                }
            }
        }
    },
    "mappings": {
        "test_index": {
            "properties": {
                "title": {
                    "analyzer": "analyzer_startswith",
                    "type": "text"
                }
            }
        }
    }
}

With the following query:

{
    "query": {
        "match_phrase_prefix": {
            "title": {
                "query": "a",
                "max_expansions": 100
            }
        }
    }
}

I added max_expansions to the query because the default value seems to be 5 so I was getting erroneous results, in your case the value might be higher.



来源:https://stackoverflow.com/questions/29741641/elasticsearch-starts-with-first-word-in-phrases

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!