Show all Elasticsearch aggregation results/buckets and not just 10

佐手、 posted on 2019-11-28 14:29:34

Question


I'm trying to list all buckets on an aggregation, but it seems to be showing only the first 10.

My search:

curl -XPOST "http://localhost:9200/imoveis/_search?pretty=1" -d'
{
   "size": 0, 
   "aggregations": {
      "bairro_count": {
         "terms": {
            "field": "bairro.raw"
         }
      }
   }
}'

Returns:

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 16920,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "bairro_count" : {
      "buckets" : [ {
        "key" : "Barra da Tijuca",
        "doc_count" : 5812
      }, {
        "key" : "Centro",
        "doc_count" : 1757
      }, {
        "key" : "Recreio dos Bandeirantes",
        "doc_count" : 1027
      }, {
        "key" : "Ipanema",
        "doc_count" : 927
      }, {
        "key" : "Copacabana",
        "doc_count" : 842
      }, {
        "key" : "Leblon",
        "doc_count" : 833
      }, {
        "key" : "Botafogo",
        "doc_count" : 594
      }, {
        "key" : "Campo Grande",
        "doc_count" : 456
      }, {
        "key" : "Tijuca",
        "doc_count" : 361
      }, {
        "key" : "Flamengo",
        "doc_count" : 328
      } ]
    }
  }
}

I have much more than 10 keys for this aggregation. In this example I'd have 145 keys, and I want the count for each of them. Is there some pagination on buckets? Can I get all of them?

I'm using Elasticsearch 1.1.0


Answer 1:


The size parameter should go inside the terms aggregation, for example:

curl -XPOST "http://localhost:9200/imoveis/_search?pretty=1" -d'
{
   "size": 0,
   "aggregations": {
      "bairro_count": {
         "terms": {
            "field": "bairro.raw",
            "size": 0
         }
      }
   }
}'

As mentioned in the docs, this works only from version 1.1.0 onwards.

Edit

Updating the answer based on @PhaedrusTheGreek's comment.

Setting size: 0 is deprecated from 2.x onwards because of the memory problems it can cause on a cluster when the field has high-cardinality values. You can read more about it in the GitHub issue here.

It is recommended to explicitly set a reasonable value for size, a number between 1 and 2147483647.
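
As a rough illustration of setting an explicit bucket limit, here is a minimal Python sketch that builds the same request body and rejects values outside the range quoted above. The helper name `terms_agg_body` and the default of 10000 are assumptions made for this example, not part of any Elasticsearch client.

```python
import json

# Upper bound quoted in the answer above (largest 32-bit signed integer).
MAX_TERMS_SIZE = 2147483647


def terms_agg_body(field, bucket_size=10000, agg_name="bairro_count"):
    """Build a search body that returns no hits and up to bucket_size buckets.

    Hypothetical helper for illustration; not an Elasticsearch API.
    """
    if not 1 <= bucket_size <= MAX_TERMS_SIZE:
        raise ValueError("bucket_size must be between 1 and %d" % MAX_TERMS_SIZE)
    return {
        "size": 0,  # top-level size: number of documents (hits) to return
        "aggregations": {
            agg_name: {
                "terms": {
                    "field": field,
                    "size": bucket_size,  # terms-level size: number of buckets
                }
            }
        },
    }


body = terms_agg_body("bairro.raw", bucket_size=200)
print(json.dumps(body, indent=2))
```

With 145 distinct keys, as in the question, any bucket_size of 145 or more should return every bucket. The printed JSON can be passed to curl with -d as in the examples above.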




Answer 2:


How to show all buckets?

{
  "size": 0,
  "aggs": {
    "aggregation_name": {
      "terms": {
        "field": "your_field",
        "size": 10000
      }
    }
  }
}

Note

  • "size": 10000 (inside terms) returns at most 10000 buckets. The default is 10.

  • "size": 0 (top level) suppresses the search hits. Without it, "hits" contains 10 documents by default, and we don't need them here.

  • By default, the buckets are ordered by the doc_count in decreasing order.
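
To check whether the chosen size was large enough, one option is to inspect the response. The sketch below is Python and assumes a response shaped like the one in the question. It also reads `sum_other_doc_count`, which newer Elasticsearch versions include in terms aggregation results but which is absent from the 1.1.0 output shown in the question. The helper name `bucket_counts` and the sample values are made up for illustration.

```python
def bucket_counts(response, agg_name):
    """Return ({key: doc_count}, other_docs) for a terms aggregation response.

    Hypothetical helper for illustration; not an Elasticsearch API.
    """
    agg = response["aggregations"][agg_name]
    counts = {bucket["key"]: bucket["doc_count"] for bucket in agg["buckets"]}
    # Documents whose terms did not make it into the returned buckets.
    # Older versions omit this key, so it is treated as optional here.
    other_docs = agg.get("sum_other_doc_count")
    return counts, other_docs


# Trimmed sample modelled on the response in the question;
# the sum_other_doc_count value is illustrative only.
sample = {
    "aggregations": {
        "bairro_count": {
            "sum_other_doc_count": 9351,
            "buckets": [
                {"key": "Barra da Tijuca", "doc_count": 5812},
                {"key": "Centro", "doc_count": 1757},
            ],
        }
    }
}

counts, other_docs = bucket_counts(sample, "bairro_count")
if other_docs:
    print("Some terms were cut off; consider a larger size.")
```

A non-zero other_docs means some documents belong to terms that were not returned, so the terms-level size was too small to list every bucket.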


Why do I get the "Fielddata is disabled on text fields by default" error?

Because fielddata is disabled on text fields by default. If you have not explicitly chosen a field type mapping, the field gets the default dynamic mapping for string fields: a text field with a keyword sub-field.

So, instead of writing "field": "your_field" you need to have "field": "your_field.keyword".
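
To make the field choice concrete, here is a small Python sketch that picks the name to aggregate on from a mapping's "properties" dict. It assumes the default dynamic mapping that Elasticsearch 5.x and later generate for string values (a text field with a keyword sub-field). The helper name `aggregatable_field` is hypothetical and handles top-level fields only.

```python
def aggregatable_field(properties, field):
    """Return the field name to use in a terms aggregation.

    properties: the "properties" dict of an index mapping.
    Hypothetical helper for illustration; top-level fields only.
    """
    spec = properties[field]
    if spec.get("type") != "text":
        return field  # keyword, numeric, date, ... aggregate directly
    # A text field needs a keyword sub-field (or fielddata) to be aggregated.
    for name, sub in spec.get("fields", {}).items():
        if sub.get("type") == "keyword":
            return "%s.%s" % (field, name)
    raise ValueError("text field %r has no keyword sub-field" % field)


# Default dynamic mapping for a string value in Elasticsearch 5.x and later.
dynamic_string = {
    "your_field": {
        "type": "text",
        "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
    }
}

print(aggregatable_field(dynamic_string, "your_field"))  # your_field.keyword
```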




Answer 3:


Increase the size inside the terms aggregation (the second size) to 10000 and you will get up to 10000 buckets. By default it is set to 10. If you also want to see search results, set the top-level size (the first size) to 1 and you will get 1 document back, since ES supports searching and aggregating in the same request.

curl -XPOST "http://localhost:9200/imoveis/_search?pretty=1" -d'
{
   "size": 1,
   "aggregations": {
      "bairro_count": {
         "terms": {
            "field": "bairro.raw",
            "size": 10000
         }
      }
   }
}'
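
For readers calling Elasticsearch from a script rather than curl, the same request can be prepared with Python's standard library. This is only a sketch: the helper name `build_search_request` is hypothetical, the URL and index are the ones from the question, and the request is built but not sent. The Content-Type header is not needed on 1.x but is required by Elasticsearch 6.0 and later.

```python
import json
import urllib.request


def build_search_request(base_url, index, field, hits_size=1, bucket_size=10000):
    """Prepare (but do not send) a POST to /<index>/_search.

    Hypothetical helper for illustration only.
    """
    body = {
        "size": hits_size,  # first size: documents returned under "hits"
        "aggregations": {
            "bairro_count": {
                # second size: maximum number of buckets returned
                "terms": {"field": field, "size": bucket_size}
            }
        },
    }
    return urllib.request.Request(
        url="%s/%s/_search" % (base_url.rstrip("/"), index),
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_search_request("http://localhost:9200", "imoveis", "bairro.raw")
print(req.get_method(), req.full_url)

# To run the search against a live cluster:
# with urllib.request.urlopen(req) as resp:
#     result = json.load(resp)
```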


Source: https://stackoverflow.com/questions/22927098/show-all-elasticsearch-aggregation-results-buckets-and-not-just-10
