Querying elasticsearch returns all documents

蓝咒 submitted on 2019-12-11 08:49:10

Question


I wonder why a search for a specific term returns all documents of an index rather than only the documents containing the requested term.

Here's the index and how I set it up (using the browser interface of the elasticsearch-head plugin):

{
  "settings": {
    "number_of_replicas": 1,
    "number_of_shards": 1,
    "analysis": {
      "filter": {
        "dutch_stemmer": {
          "type": "dictionary_decompounder",
          "word_list": [
            "koud",
            "plaat",
            "staal",
            "fabriek"
          ]
        },
        "snowball_nl": {
          "type": "snowball",
          "language": "dutch"
        }
      },
      "analyzer": {
        "dutch": {
          "tokenizer": "standard",
          "filter": [
            "length",
            "lowercase",
            "asciifolding",
            "dutch_stemmer",
            "snowball_nl"
          ]
        }
      }
    }
  }
}

{
  "properties": {
    "test": {
      "type": "string",
      "fields": {
        "dutch": {
          "type": "string",
          "analyzer": "dutch"
        }
      }
    }
  }
}

Then I added some docs:

{"test": "ijskoud"}
{"test": "plaatstaal"}
{"test": "kristalfabriek"}

So when firing a search for "plaat", one would expect the search to come back with the document containing "plaatstaal".

{
  "match": {
    "test": "plaat"
  }
}

However, as if to save me further searches, Elasticsearch returns all documents regardless of their text content. Is there anything I am missing here? Funnily enough, there is a difference between using GET and POST: the latter brings back no hits, while GET returns all documents.

Any help is much appreciated.


Answer 1:


You need to configure your index to use your custom analyzer:

PUT /some_index
{
  "settings": {
     ...
  },
  "mappings": {
    "doc": {
      "properties": {
        "test": {
          "type": "string",
          "analyzer": "dutch"
        }
      }
    }
  }
}

If you have more fields that use this analyzer and don't want to specify the analyzer for each of them, you can set it for a specific type in that index like this:

  "mappings": {
    "doc": {
      "analyzer": "dutch"
    }
  }

If you want ALL your types in that index to use your custom analyzer:

  "mappings": {
    "_default_": {
      "analyzer": "dutch"
    }
  }

To test your analyzer in a simple way:

GET /some_index/_analyze?text=plaatstaal&analyzer=dutch
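
As a rough mental model of what the `dutch_stemmer` filter contributes to that output, here is a small Python sketch of dictionary decompounding. It is a simplification and not the Lucene implementation: the function name `decompound` and the `min_subword` parameter are made up for illustration, and the real analyzer also lowercases, folds, and stems the tokens.

```python
def decompound(token, word_list, min_subword=2):
    """Return the token followed by every dictionary word found inside it.

    Simplified illustration only; a hypothetical helper, not an Elasticsearch API.
    """
    words = {w.lower() for w in word_list}
    lowered = token.lower()
    parts = [token]  # the original token is kept alongside its sub-words
    for start in range(len(lowered)):
        for end in range(start + min_subword, len(lowered) + 1):
            if lowered[start:end] in words:
                parts.append(lowered[start:end])
    return parts


# Word list taken from the index settings in the question.
WORD_LIST = ["koud", "plaat", "staal", "fabriek"]

for text in ["ijskoud", "plaatstaal", "kristalfabriek"]:
    print(text, "->", decompound(text, WORD_LIST))
```

Once sub-words such as "plaat" are indexed alongside the compound, a match query for "plaat" has a term to hit in the "plaatstaal" document.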

This would be the full list of steps to perform:

DELETE /some_index

PUT /some_index
{
  "settings": {
    "number_of_replicas": 1,
    "number_of_shards": 1,
    "analysis": {
      "filter": {
        "dutch_stemmer": {
          "type": "dictionary_decompounder",
          "word_list": [
            "koud",
            "plaat",
            "staal",
            "fabriek"
          ]
        },
        "snowball_nl": {
          "type": "snowball",
          "language": "dutch"
        }
      },
      "analyzer": {
        "dutch": {
          "tokenizer": "standard",
          "filter": [
            "length",
            "lowercase",
            "asciifolding",
            "dutch_stemmer",
            "snowball_nl"
          ]
        }
      }
    }
  },
  "mappings": {
    "doc": {
      "properties": {
        "test": {
          "type": "string",
          "analyzer": "dutch"
        }
      }
    }
  }
}

POST /some_index/doc/_bulk
{"index":{}}
{"test": "ijskoud"}
{"index":{}}
{"test": "plaatstaal"}
{"index":{}}
{"test": "kristalfabriek"}

GET /some_index/doc/_search
{
  "query": {
    "match": {
      "test": "plaat"
    }
  }
}

And the result of the search:

{
   "took": 1,
   "timed_out": false,
   "_shards": {
      "total": 1,
      "successful": 1,
      "failed": 0
   },
   "hits": {
      "total": 1,
      "max_score": 1.987628,
      "hits": [
         {
            "_index": "some_index",
            "_type": "doc",
            "_id": "jlGkoJWoQfiVGiuT_TUCpg",
            "_score": 1.987628,
            "_source": {
               "test": "plaatstaal"
            }
         }
      ]
   }
}
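
If you script these steps instead of typing them into a console, keep in mind that the bulk request body is newline-delimited JSON: one action line, then one document line, ending with a trailing newline. A minimal Python sketch of building that body follows; the helper name `bulk_body` is hypothetical, and sending the request is left out.

```python
import json


def bulk_body(docs):
    """Build a newline-delimited bulk body that indexes each document."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {}}))  # action line, auto-generated id
        lines.append(json.dumps(doc))            # document source line
    return "\n".join(lines) + "\n"               # the body must end with a newline


docs = [{"test": "ijskoud"}, {"test": "plaatstaal"}, {"test": "kristalfabriek"}]
print(bulk_body(docs), end="")
```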



Answer 2:


When you use GET, the request body is not sent, so the search runs without any query and all documents are returned.

When you use POST, your search query does get passed on. It probably returns nothing because your documents are not being analyzed as you intended.
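
To make sure the query actually travels with the request, you can send it as a JSON body with an explicit method. Here is a minimal sketch with Python's standard library, assuming a node at `http://localhost:9200` (an assumed address) and the index name `some_index` from the first answer. The helper `build_search_request` is hypothetical, and the request is only constructed here, not sent.

```python
import json
import urllib.request


def build_search_request(base_url, index, query):
    """Build a POST search request that carries the query in its body."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_search",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Assumed local node address; adjust to your own cluster.
req = build_search_request(
    "http://localhost:9200", "some_index", {"match": {"test": "plaat"}}
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing `req` to `urllib.request.urlopen` would send it to a running node.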



Source: https://stackoverflow.com/questions/26502397/querying-elasticsearch-returns-all-documents
