Find documents with an empty string value in Elasticsearch

不知归路 2020-12-03 09:46

I've been trying to filter with Elasticsearch only those documents that contain an empty string in their body. So far I'm having no luck.

Before I go on, I should

12 answers
  • 2020-12-03 10:26

    I'm using Elasticsearch 5.3 and was having trouble with some of the above answers.

    The following body worked for me.

     {
        "query": {
            "bool" : {
                "must" : {
                    "script" : {
                        "script" : {
                            "inline": "doc['city'].empty",
                            "lang": "painless"
                         }
                    }
                }
            }
        }
    }
    

    Note: fielddata is disabled by default for text fields, so you might need to enable it. I would read https://www.elastic.co/guide/en/elasticsearch/reference/current/fielddata.html before doing so.

    To enable fielddata for a field, e.g. 'city' on index 'business' with type name 'record', you need:

    PUT business/_mapping/record
    {
        "properties": {
            "city": {
              "type": "text",
              "fielddata": true
            }
          }
    }
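
If you send this query from a client, the request body above can be assembled as a plain dictionary. This is only a minimal sketch and not part of the original answer: `build_empty_field_query` is a hypothetical helper name, and the `script_key` parameter is an assumption based on later Elasticsearch releases, which as far as I know renamed `inline` to `source`. Check the documentation for your version.

```python
import json


def build_empty_field_query(field, script_key="inline"):
    """Build a bool/script query matching documents whose doc values for `field` are empty.

    script_key is "inline" on Elasticsearch 5.3, as in the answer above.
    Newer releases are assumed (not verified here) to expect "source" instead.
    """
    return {
        "query": {
            "bool": {
                "must": {
                    "script": {
                        "script": {
                            script_key: "doc['%s'].empty" % field,
                            "lang": "painless",
                        }
                    }
                }
            }
        }
    }


if __name__ == "__main__":
    # Print the body for the 'city' field used in the answer.
    print(json.dumps(build_empty_field_query("city"), indent=2))
```

The printed JSON can be pasted as the body of a `_search` request.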
    
  • 2020-12-03 10:26

    For nested fields use:

    curl -XGET "http://localhost:9200/city/_search?pretty=true" -d '{
         "query" : {
             "nested" : {
                 "path" : "country",
                 "score_mode" : "avg",
                 "query" : {
                     "bool": {
                        "must_not": {
                            "exists": {
                                "field": "country.name" 
                            }
                        }
                     }
                 }
             }
         }
    }'
    

    NOTE: path and field together determine what is searched: path is the nested object and field is the full name of the field inside it. Change both to match your mapping.

    For regular fields:

    curl -XGET 'http://localhost:9200/city/_search?pretty=true' -d'{
        "query": {
            "bool": {
                "must_not": {
                    "exists": {
                        "field": "name"
                    } 
                } 
            } 
        } 
    }'
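
The two request bodies above differ only in the `nested` wrapper, which the following sketch makes explicit. It is an illustration under assumptions rather than anything from the original answer: `missing_field_query` is a hypothetical helper, and the prefix check simply encodes the note that `field` has to live under `path`.

```python
import json


def missing_field_query(field, nested_path=None):
    """Return a search body matching documents that have no value for `field`."""
    must_not_exists = {"bool": {"must_not": {"exists": {"field": field}}}}
    if nested_path is None:
        # Regular field: the bool query is the whole query.
        return {"query": must_not_exists}
    # Nested field: `field` must be the full dotted name under `nested_path`.
    if not field.startswith(nested_path + "."):
        raise ValueError("field %r is not inside nested path %r" % (field, nested_path))
    return {
        "query": {
            "nested": {
                "path": nested_path,
                "score_mode": "avg",
                "query": must_not_exists,
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(missing_field_query("name")))
    print(json.dumps(missing_field_query("country.name", nested_path="country")))
```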
    
  • 2020-12-03 10:26

    You need to target the keyword-indexed version of the field by adding .content to your field name. Depending on how the original index was set up, the following "just works" for me using AWS Elasticsearch v6.x.

    GET /my_idx/_search?q=my_field.content:""

  • 2020-12-03 10:27

    With the default (standard) analyzer, an empty string produces no tokens, so there is nothing in the index to match. You therefore need to index the field verbatim (not analyzed). Here is an example:

    Add a mapping that indexes the field untokenized. If you also need a tokenized copy of the field, you can use a multi-field type.

    PUT http://localhost:9200/test/_mapping/demo
    {
      "demo": {
        "properties": {
          "_content": {
            "type": "string",
            "index": "not_analyzed"
          }
        }
      }
    }
    

    Next, index a couple of documents.

    POST http://localhost:9200/test/demo/1
    {
      "_content": ""
    }
    
    POST http://localhost:9200/test/demo/2
    {
      "_content": "some content"
    }
    

    Execute a search:

    POST http://localhost:9200/test/demo/_search
    {
      "query": {
        "filtered": {
          "filter": {
            "term": {
              "_content": ""
            }
          }
        }
      }
    }
    

    Returns the document with the empty string.

    {
        "took": 2,
        "timed_out": false,
        "_shards": {
            "total": 5,
            "successful": 5,
            "failed": 0
        },
        "hits": {
            "total": 1,
            "max_score": 0.30685282,
            "hits": [
                {
                    "_index": "test",
                    "_type": "demo",
                    "_id": "1",
                    "_score": 0.30685282,
                    "_source": {
                        "_content": ""
                    }
                }
            ]
        }
    }
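
The reasoning at the start of this answer can be illustrated without a cluster. The sketch below is a toy model only: `toy_standard_analyze` is a hypothetical, much simplified stand-in for an analyzer (lowercase, then split on word characters) and is not Elasticsearch's actual standard analyzer. It just shows why an analyzed empty string leaves nothing to match, while a verbatim field keeps the empty value as a term.

```python
import re


def toy_standard_analyze(text):
    """Rough stand-in for an analyzer: lowercase, then split into word tokens."""
    return re.findall(r"\w+", text.lower())


def toy_not_analyzed(text):
    """A verbatim (not analyzed) field keeps the whole value as a single term."""
    return [text]


if __name__ == "__main__":
    for value in ["", "some content"]:
        # Show the terms each indexing strategy would produce for the value.
        print(repr(value), toy_standard_analyze(value), toy_not_analyzed(value))
```

For the empty string the toy analyzer returns an empty list, whereas the verbatim version returns a list containing one empty term, which is what the term filter in the search above looks for.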
    
  • 2020-12-03 10:28

    I didn't manage to search for empty strings in a text field. However it seems to work with a field of type keyword. So I suggest the following:

        DELETE /test_idx
    
        PUT /test_idx
        {
          "mappings" : {
            "testMapping": {
              "properties" : {
                "tag" : {"type":"text"},
                "content" : {"type":"text",
                             "fields" : {
                               "x" : {"type" : "keyword"}
                             }
                }
              }
            }
          }
        }
    
    PUT /test_idx/testMapping/1
    {
      "tag": "null"
    }
    
    PUT /test_idx/testMapping/2
    {
      "tag": "empty",
      "content": ""
    }
    
    GET /test_idx/testMapping/_search
    {
      "query": {
        "match": {"content.x": ""}
      }
    }
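
As a possible variation on the search above, an exact `term` lookup inside a `bool` filter avoids scoring. This is a sketch under assumptions: `empty_string_filter` is a hypothetical helper, and the field passed in is assumed to be a keyword (or otherwise unanalyzed) field such as `content.x`. The `bool`/`filter` form is also what replaced the `filtered` query, which was removed in Elasticsearch 5.0.

```python
import json


def empty_string_filter(field):
    """Search body that filters on an exact empty-string term, without scoring."""
    return {"query": {"bool": {"filter": {"term": {field: ""}}}}}


if __name__ == "__main__":
    # Body for the keyword sub-field defined in the mapping above.
    print(json.dumps(empty_string_filter("content.x")))
```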
    
  • 2020-12-03 10:34

    If you don't want to or can't re-index there is another way. :-)

    You can use the negation operator with a wildcard: fieldToLookFor:* matches any non-blank string, so negating it returns the documents where the field is blank or missing.

    GET /my_index/_search?q=!(fieldToLookFor:*)
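
When the request above is sent from something other than a console that encodes it for you, characters such as `!`, `(`, `:` and `*` in the `q` parameter generally need percent-encoding. The following is a minimal sketch using only the Python standard library; `search_url` is a hypothetical helper and `http://localhost:9200` is assumed as the host, as in the other answers.

```python
from urllib.parse import urlencode


def search_url(base, index, query_string):
    """Build a URI-search URL with the q parameter percent-encoded."""
    return "%s/%s/_search?%s" % (base.rstrip("/"), index, urlencode({"q": query_string}))


if __name__ == "__main__":
    print(search_url("http://localhost:9200", "my_index", "!(fieldToLookFor:*)"))
```

The resulting URL can be passed to curl or any HTTP client.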
    