I've been trying to use Elasticsearch to filter only those documents that contain an empty string in their body. So far I'm having no luck.
Before I go on, I should
I'm using Elasticsearch 5.3 and was having trouble with some of the above answers.
The following body worked for me.
{
"query": {
"bool" : {
"must" : {
"script" : {
"script" : {
"inline": "doc['city'].empty",
"lang": "painless"
}
}
}
}
}
}
Note: you might need to enable fielddata for text fields, since it is disabled by default. I would read this before doing so, though: https://www.elastic.co/guide/en/elasticsearch/reference/current/fielddata.html
To enable fielddata for a field, e.g. 'city' on index 'business' with type name 'record', you need:
PUT business/_mapping/record
{
"properties": {
"city": {
"type": "text",
"fielddata": true
}
}
}
For nested fields use:
curl -XGET "http://localhost:9200/city/_search?pretty=true" -d '{
"query" : {
"nested" : {
"path" : "country",
"score_mode" : "avg",
"query" : {
"bool": {
"must_not": {
"exists": {
"field": "country.name"
}
}
}
}
}
}
}'
NOTE: path and field together determine what is searched, and the field must be given with its full path (here country.name). Change both to match your own mapping.
For regular fields:
curl -XGET 'http://localhost:9200/city/_search?pretty=true' -d'{
"query": {
"bool": {
"must_not": {
"exists": {
"field": "name"
}
}
}
}
}'
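The two request bodies above differ only in the nested wrapper, so they can be generated from one place. Here is a minimal sketch in Python, assuming you only need the JSON body and send it yourself (with curl or any HTTP client). The function name missing_field_query is hypothetical and not part of any Elasticsearch client library.

```python
import json


def missing_field_query(field, nested_path=None):
    """Build a search body matching documents where `field` has no value.

    Hypothetical helper for illustration only.
    """
    query = {"bool": {"must_not": {"exists": {"field": field}}}}
    if nested_path is not None:
        # Nested objects have to be queried through a "nested" wrapper,
        # and `field` must include the path prefix (e.g. "country.name").
        query = {
            "nested": {
                "path": nested_path,
                "score_mode": "avg",
                "query": query,
            }
        }
    return {"query": query}


# Regular field, as in the second curl example.
print(json.dumps(missing_field_query("name"), indent=2))

# Nested field, as in the first curl example.
print(json.dumps(missing_field_query("country.name", nested_path="country"), indent=2))
```

The printed JSON can be pasted after -d in the curl commands above.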
You need to trigger the keyword indexer by adding .content to your field name. Depending on how the original index was set up, the following "just works" for me using AWS ElasticSearch v6.x.
GET /my_idx/_search?q=my_field.content:""
If you are using the default analyzer (standard), there is nothing for it to analyze when the value is an empty string, so no term gets indexed for it. You therefore need to index the field verbatim (not analyzed). Here is an example:
Add a mapping that will index the field untokenized. If you need a tokenized copy of the field indexed as well, you can use a Multi Field type.
PUT http://localhost:9200/test/_mapping/demo
{
"demo": {
"properties": {
"_content": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
Next, index a couple of documents.
POST http://localhost:9200/test/demo/1
{
"_content": ""
}
POST http://localhost:9200/test/demo/2
{
"_content": "some content"
}
Execute a search:
POST http://localhost:9200/test/demo/_search
{
"query": {
"filtered": {
"filter": {
"term": {
"_content": ""
}
}
}
}
}
This returns the document with the empty string:
{
took: 2,
timed_out: false,
_shards: {
total: 5,
successful: 5,
failed: 0
},
hits: {
total: 1,
max_score: 0.30685282,
hits: [
{
_index: test,
_type: demo,
_id: 1,
_score: 0.30685282,
_source: {
_content: ""
}
}
]
}
}
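To make the reasoning in this answer concrete, here is a toy illustration in Python of why an analyzed field has nothing to match for an empty string while a verbatim field does. It is only a rough stand-in: the regex tokenizer is my own simplification and not how the real standard analyzer is implemented, and both function names are hypothetical.

```python
import re


def analyzed_terms(value):
    # Rough approximation of an analyzer: lowercase, then split into word tokens.
    # The real standard analyzer is more elaborate; this is an illustration only.
    return re.findall(r"\w+", value.lower())


def not_analyzed_terms(value):
    # A verbatim (not_analyzed) field indexes the whole value as a single term,
    # even when that value is the empty string.
    return [value]


for value in ["", "some content"]:
    print(repr(value), analyzed_terms(value), not_analyzed_terms(value))
```

For the empty string the analyzed version yields no terms at all, whereas the verbatim version yields one term (the empty string itself), which is what the term filter above matches.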
I didn't manage to search for empty strings in a text field. However, it seems to work with a field of type keyword, so I suggest the following:
delete /test_idx
put test_idx
{
"mappings" : {
"testMapping": {
"properties" : {
"tag" : {"type":"text"},
"content" : {"type":"text",
"fields" : {
"x" : {"type" : "keyword"}
}
}
}
}
}
}
put /test_idx/testMapping/1
{
"tag": "null"
}
put /test_idx/testMapping/2
{
"tag": "empty",
"content": ""
}
GET /test_idx/testMapping/_search
{
"query" : {
"match" : {"content.x" : ""}
}
}
If you don't want to or can't re-index there is another way. :-)
You can use the negation operator together with the wildcard *, which matches any non-blank string:
GET /my_index/_search?q=!(fieldToLookFor:*)
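If you send this outside of a console that encodes the URL for you, characters such as !, (, ) and * may need percent-encoding (and quoting in a shell). Here is a small sketch using only the Python standard library. The helper name uri_search_url is hypothetical, and http://localhost:9200 is just an assumed local node address.

```python
from urllib.parse import urlencode


def uri_search_url(base_url, index, query_string):
    # urlencode percent-encodes characters like !, (, ), : and * in the q parameter.
    return f"{base_url}/{index}/_search?" + urlencode({"q": query_string})


url = uri_search_url("http://localhost:9200", "my_index", "!(fieldToLookFor:*)")
print(url)
# http://localhost:9200/my_index/_search?q=%21%28fieldToLookFor%3A%2A%29
```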