Elasticsearch - Choosing the analyzer to use for fields

Question

How do I tell query_string which analyzer to use in a search?

I've created my index with an analyzer like so:

"analysis": {
  "analyzer": {
    "std_analyzer": {
      "tokenizer": "whitespace",
      "filter": [ "stemmer" ]
    }
  }
}

I do not predefine any mappings. Instead, I rely on mappings being added dynamically when a document is inserted.

After calling /my_index/_mapping, the mapping looks like this:

      "short_bio" : {
        "type" : "text",
        "fields" : {
          "keyword" : {
            "type" : "keyword",
            "ignore_above" : 256
          }
        }
      },

As you can see, no analyzer is defined in the mapping when the field is added dynamically.

Does this mean searching will automatically use the analyzer that was created with the index (std_analyzer)? Or is some other analyzer used? How do I force it to use the analyzer that I want?

If it's relevant, I'm searching with query_string to take advantage of AND/OR/NOT and grouping.

Thanks!

Answer 1:

Please refer to the explanation of the query_string analyzer parameter in the official docs:

(Optional, string) Analyzer used to convert text in the query string into tokens. Defaults to the index-time analyzer mapped for the default_field. If no analyzer is mapped, the index’s default analyzer is used.

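In your case, the dynamically mapped field has no analyzer of its own, and defining an analyzer named std_analyzer in the index settings does not make it the index's default. As a result, query_string uses the built-in standard analyzer for text fields and the keyword (no-op) analyzer for keyword fields.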
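If the goal is to force a particular analyzer at search time, the query_string query accepts the analyzer parameter quoted above. Here is a minimal sketch in Python that only builds the request body and sends nothing to a cluster. The helper name and the search terms are made up for illustration; the field and analyzer names come from the question.

```python
import json


def query_string_body(query, field, analyzer):
    """Build a search body for a query_string query with an explicit analyzer.

    Hypothetical helper: it only assembles a dict, it does not call Elasticsearch.
    """
    return {
        "query": {
            "query_string": {
                "query": query,            # supports AND / OR / NOT and grouping
                "default_field": field,    # field searched when none is named in the query
                "analyzer": analyzer,      # overrides the analyzer used on the query text
            }
        }
    }


# Example terms are invented; "short_bio" and "std_analyzer" are from the question.
body = query_string_body(
    "(engineer OR developer) AND NOT manager", "short_bio", "std_analyzer"
)
print(json.dumps(body, indent=2))
```

The resulting dict would be sent as the body of a search request against my_index. Note that this only changes how the query text is analyzed. Documents that were already indexed with the standard analyzer keep their original tokens, so stemmed query terms may not match them.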

Also, don't confuse your custom analyzer with the index's default analyzer; you can check how the default is determined in the official documentation.
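To make the distinction concrete, here are two ways the custom analyzer could be applied at index-creation time, sketched as Python dicts. This assumes you are free to create a new index; the variable names are illustrative only, and the analyzer definition is copied from the question.

```python
import json

# Analyzer definition taken from the question.
custom_analyzer = {"tokenizer": "whitespace", "filter": ["stemmer"]}

# Option A: register the analyzer under the reserved name "default", so that
# dynamically mapped text fields without an explicit analyzer pick it up.
body_with_default = {
    "settings": {"analysis": {"analyzer": {"default": custom_analyzer}}}
}

# Option B: keep the name std_analyzer and map the field to it explicitly.
body_with_mapping = {
    "settings": {"analysis": {"analyzer": {"std_analyzer": custom_analyzer}}},
    "mappings": {
        "properties": {
            "short_bio": {"type": "text", "analyzer": "std_analyzer"}
        }
    },
}

print(json.dumps(body_with_default, indent=2))
print(json.dumps(body_with_mapping, indent=2))
```

Either dict would be the body of the index-creation request. The analyzer of an existing field cannot be changed in place, so for an index that already holds data this generally means creating a new index and reindexing.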

Also, as mentioned in the docs, query_string returns an error for invalid syntax. Your AND/OR/NOT use case can easily be handled by the bool query, which is the preferred option.
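As a rough illustration of the bool query alternative, the expression (engineer OR developer) AND NOT manager could be written as follows. The search terms and the match helper are invented for this sketch; only the field name comes from the question.

```python
import json


def match(field, text):
    """Hypothetical helper: a single match clause on one field."""
    return {"match": {field: text}}


bool_body = {
    "query": {
        "bool": {
            # AND: every clause under "must" has to match.
            "must": [
                {
                    "bool": {
                        # OR: at least one "should" clause has to match.
                        "should": [
                            match("short_bio", "engineer"),
                            match("short_bio", "developer"),
                        ],
                        "minimum_should_match": 1,
                    }
                }
            ],
            # NOT: documents matching these clauses are excluded.
            "must_not": [match("short_bio", "manager")],
        }
    }
}

print(json.dumps(bool_body, indent=2))
```

Each match clause is analyzed with the search analyzer of its field, so no query syntax has to be parsed and malformed user input cannot cause a syntax error.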

Source: https://stackoverflow.com/questions/65574089/elasticsearch-choosing-the-analyzer-to-use-for-fields

Tags

ElasticSearch

elasticsearch-dsl