Elasticsearch aggregation. Order by nested bucket doc_count

核能气质少年 提交于 2019-12-31 04:30:30

问题


What I want to achieve is aggregation by unique pairs (city, STATE). As per Elasticsearch documentation The terms aggregation does not support collecting terms from multiple fields in the same document. Thus I created a nested agg like this:

{
  "size": 0,
  "aggs": {
    "cities": {
      "terms": {
        "field": "address.city",
        "size": 12
      },
      "aggs": {
        "states": {
          "terms": {
            "field": "address.stateOrProvince"
          },
          "aggs": {
            "topCity": {
              "top_hits": {
                "size": 1,
                "sort": [
                  {
                    "price.value": {
                      "order": "desc" }}]}}}}}}}}

As a result of this aggregation I get response like this:

{
  "aggregations": {
    "cities": {
      "buckets": [
        {
          "key": "las vegas",
          "doc_count": 5927,
          "states": {
            "buckets": [
              { "key": "nv", "doc_count": 5840 },
              { "key": "nm", "doc_count": 85 }
            ]
          }
        },
        {
          "key": "jacksonville",
          "doc_count": 5689,
          "states": {
            "buckets": [
              { "key": "fl", "doc_count": 2986 },
              { "key": "nc", "doc_count": 1962 },
              { "key": "ar", "doc_count": 290 }]}}]}}}

The question is how to get results ordered by the deepest doc_count?

Expected ordered list should be like this:

  1. las vegas, nv (5840)
  2. jacksonville, fl (2986)
  3. jacksonville, nc (1962)
  4. jacksonville, ar (290)
  5. las vegas, nm (85)

回答1:


I don't believe there is a way to sort on the inner doc_count accross multiple buckets. In ES 2.0 (still in Beta) you'll be able to take action on aggregations but that's not possible in ES 1.x




回答2:


I've managed to solve the problem by applying transform

"transform": {
  "script": "ctx._source['address']['cityState'] = ctx._source['address']['city'] + ', ' + ctx._source['address']['state']"
}

and then aggregating on the newly added field. Works as expected!



来源:https://stackoverflow.com/questions/32908712/elasticsearch-aggregation-order-by-nested-bucket-doc-count

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!