Question
I wonder why a search for a specific term returns all documents in an index rather than only the documents containing the requested term.
Here's the index and how I set it up (using the browser interface of the Elasticsearch head plugin):
{
  "settings": {
    "number_of_replicas": 1,
    "number_of_shards": 1,
    "analysis": {
      "filter": {
        "dutch_stemmer": {
          "type": "dictionary_decompounder",
          "word_list": [
            "koud",
            "plaat",
            "staal",
            "fabriek"
          ]
        },
        "snowball_nl": {
          "type": "snowball",
          "language": "dutch"
        }
      },
      "analyzer": {
        "dutch": {
          "tokenizer": "standard",
          "filter": [
            "length",
            "lowercase",
            "asciifolding",
            "dutch_stemmer",
            "snowball_nl"
          ]
        }
      }
    }
  }
}
{
  "properties": {
    "test": {
      "type": "string",
      "fields": {
        "dutch": {
          "type": "string",
          "analyzer": "dutch"
        }
      }
    }
  }
}
Then I added some documents:
{"test": "ijskoud"}
{"test": "plaatstaal"}
{"test": "kristalfabriek"}
So when I fire a search for "plaat", I would expect it to come back with the document containing "plaatstaal".
{
  "match": {
    "test": "plaat"
  }
}
However, Elasticsearch returns all documents regardless of their text content. Is there anything I am missing here? Funnily enough, there is a difference between using GET and POST: the latter brings back no hits, while GET returns all documents.
Any help is much appreciated.
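Answer 1:
You need to configure the field you query to use your custom analyzer. In your mapping only the "test.dutch" sub-field uses it, while your query targets "test":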
PUT /some_index
{
  "settings": {
    ...
  },
  "mappings": {
    "doc": {
      "properties": {
        "test": {
          "type": "string",
          "analyzer": "dutch"
        }
      }
    }
  }
}
If you have more fields that use this analyzer and don't want to specify the analyzer for each of them, you can set it for a specific type in that index like this:
"mappings": {
  "doc": {
    "analyzer": "dutch"
  }
}
If you want ALL the types in that index to use your custom analyzer:
"mappings": {
  "_default_": {
    "analyzer": "dutch"
  }
}
To test your analyzer in a simple way:
GET /some_index/_analyze?text=plaatstaal&analyzer=dutch
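As a rough illustration of what that analyze call is expected to show, the sketch below mimics the decompounding step in plain Python. It is a simplified model, not the Lucene implementation: the function name decompound and the two size parameters are assumptions made for this example, and the snowball stemming stage of the analyzer is left out entirely.

```python
# Simplified model of a dictionary decompounder: keep the original token
# and add every dictionary word found inside it as a substring.
# Hypothetical helper for illustration; Elasticsearch does this inside Lucene.

WORD_LIST = ["koud", "plaat", "staal", "fabriek"]


def decompound(token, word_list, min_word_size=5, min_subword_size=2):
    token = token.lower()
    tokens = [token]  # the original token is always kept
    if len(token) < min_word_size:
        return tokens  # short tokens are not decompounded in this sketch
    words = set(word_list)
    for start in range(len(token)):
        for end in range(start + min_subword_size, len(token) + 1):
            if token[start:end] in words:
                tokens.append(token[start:end])
    return tokens


for text in ["ijskoud", "plaatstaal", "kristalfabriek"]:
    print(text, "->", decompound(text, WORD_LIST))
```

Under this model "plaatstaal" is indexed with the extra tokens "plaat" and "staal", which is why a match query for "plaat" can find it once the field actually uses the analyzer.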
This would be the full list of steps to perform:
DELETE /some_index
PUT /some_index
{
  "settings": {
    "number_of_replicas": 1,
    "number_of_shards": 1,
    "analysis": {
      "filter": {
        "dutch_stemmer": {
          "type": "dictionary_decompounder",
          "word_list": [
            "koud",
            "plaat",
            "staal",
            "fabriek"
          ]
        },
        "snowball_nl": {
          "type": "snowball",
          "language": "dutch"
        }
      },
      "analyzer": {
        "dutch": {
          "tokenizer": "standard",
          "filter": [
            "length",
            "lowercase",
            "asciifolding",
            "dutch_stemmer",
            "snowball_nl"
          ]
        }
      }
    }
  },
  "mappings": {
    "doc": {
      "properties": {
        "test": {
          "type": "string",
          "analyzer": "dutch"
        }
      }
    }
  }
}
POST /some_index/doc/_bulk
{"index":{}}
{"test": "ijskoud"}
{"index":{}}
{"test": "plaatstaal"}
{"index":{}}
{"test": "kristalfabriek"}
GET /some_index/doc/_search
{
  "query": {
    "match": {
      "test": "plaat"
    }
  }
}
And the result of the search:
{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1.987628,
    "hits": [
      {
        "_index": "some_index",
        "_type": "doc",
        "_id": "jlGkoJWoQfiVGiuT_TUCpg",
        "_score": 1.987628,
        "_source": {
          "test": "plaatstaal"
        }
      }
    ]
  }
}
Answer 2:
When you use GET, the request body is not passed along, so the search runs without any query and all documents are returned.
When you use POST, your search query does get passed on. It probably returns nothing because your documents are not being analyzed the way you intended.
Source: https://stackoverflow.com/questions/26502397/querying-elasticsearch-returns-all-documents
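To make sure the query body really travels with the request, one option is to build the search as an explicit POST. The following is a minimal sketch using only the Python standard library; the URL (a local node on port 9200 and the index name some_index) and the helper name build_search_request are assumptions for illustration, and the request is only constructed, not sent.

```python
import json
import urllib.request

# Assumed endpoint: a local node and the index name used in Answer 1.
SEARCH_URL = "http://localhost:9200/some_index/_search"


def build_search_request(field, term):
    """Build a POST search request whose body carries a match query."""
    body = {"query": {"match": {field: term}}}
    return urllib.request.Request(
        SEARCH_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_search_request("test", "plaat")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# Passing the request to urllib.request.urlopen would send it; that call is
# omitted here so the sketch runs without a cluster.
```

Printing the method and body before sending is a quick way to confirm that the client is not silently dropping the query.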