Range ElasticSearch Aggregation

浪子不回头ぞ posted on 2020-02-25 03:49:08

Question


I need to compute a pipeline aggregation in ElasticSearch and I can't figure out how to express it.

Each document has an email address and an amount. I need to output range buckets of amount counts, grouped by unique email.

{ "0 - 99": 300, "100 - 400": 100 ...}

This would basically be the expected output (my application code would transform the keys). It indicates that 300 unique emails have each cumulatively received between 0 and 99 (amount) across all documents, 100 unique emails have each received between 100 and 400, and so on.
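
To make the intended computation concrete, here is a minimal Python sketch that does the same thing over an in-memory list of documents. It assumes half-open ranges with boundaries at 100, 300, and 600 (so a total of exactly 100 falls in the second range); the function name and the sample documents are made up for illustration.

```python
from bisect import bisect_right
from collections import defaultdict


def count_emails_by_total(docs, bounds):
    """Count unique emails per range of their summed amount.

    bounds must be sorted; they split the number line into
    len(bounds) + 1 half-open ranges: (-inf, b0), [b0, b1), ..., [bn, +inf).
    """
    # Step 1: sum the amounts per email (the terms + sum part).
    totals = defaultdict(float)
    for doc in docs:
        totals[doc["email"]] += doc["amount"]

    # Step 2: place each per-email total into its range (the range part).
    counts = [0] * (len(bounds) + 1)
    for total in totals.values():
        counts[bisect_right(bounds, total)] += 1
    return counts


# Hypothetical sample data.
docs = [
    {"email": "a@example.com", "amount": 40},
    {"email": "a@example.com", "amount": 50},
    {"email": "b@example.com", "amount": 120},
    {"email": "c@example.com", "amount": 700},
]
print(count_emails_by_total(docs, [100, 300, 600]))  # [1, 1, 0, 1]
```

The question is how to get Elasticsearch to do both steps on the server side.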

Intuitively, I would expect a query like the one below to work. However, range does not appear to be a pipeline aggregation (it does not accept buckets_path).

What is the correct approach here?

{
  aggs: {
    users: {
      terms: {
        field: "email"
      },
      aggs: {
        amount_received: {
          sum: {
            field: "amount"
          }
        }
      }
    },
    amount_ranges: {
      range: {
        buckets_path: "users>amount_received",
        ranges: [
          { to: 99.0 },
          { from: 100.0, to: 299.0 },
          { from: 300.0, to: 599.0 },
          { from: 600.0 }
        ]
      }
    }
  }
}

Answer 1:


There's no pipeline aggregation that does this directly. However, I think I've come up with a solution that should suit your needs. The idea is to repeat the same terms/sum aggregation once per range, and in each copy use a bucket_selector pipeline aggregation to keep only the email buckets whose summed amount falls within that range.

POST index/_search
{
  "size": 0,
  "aggs": {
    "users_99": {
      "terms": {
        "field": "email",
        "size": 1000
      },
      "aggs": {
        "amount_received": {
          "sum": {
            "field": "amount"
          }
        },
        "-99": {
          "bucket_selector": {
            "buckets_path": {
              "amountReceived": "amount_received"
            },
            "script": "params.amountReceived < 100"
          }
        }
      }
    },
    "users_100_299": {
      "terms": {
        "field": "email",
        "size": 1000
      },
      "aggs": {
        "amount_received": {
          "sum": {
            "field": "amount"
          }
        },
        "100-299": {
          "bucket_selector": {
            "buckets_path": {
              "amountReceived": "amount_received"
            },
            "script": "params.amountReceived >= 100 && params.amountReceived < 300"
          }
        }
      }
    },
    "users_300_599": {
      "terms": {
        "field": "email",
        "size": 1000
      },
      "aggs": {
        "amount_received": {
          "sum": {
            "field": "amount"
          }
        },
        "300-599": {
          "bucket_selector": {
            "buckets_path": {
              "amountReceived": "amount_received"
            },
            "script": "params.amountReceived >= 300 && params.amountReceived < 600"
          }
        }
      }
    },
    "users_600": {
      "terms": {
        "field": "email",
        "size": 1000
      },
      "aggs": {
        "amount_received": {
          "sum": {
            "field": "amount"
          }
        },
        "600": {
          "bucket_selector": {
            "buckets_path": {
              "amountReceived": "amount_received"
            },
            "script": "params.amountReceived >= 600"
          }
        }
      }
    }
  }
}
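
Since the four aggregations differ only in their name and script, the request body could also be generated from a list of boundaries instead of being written by hand. The following Python sketch is one way to do that; the function name, the generated aggregation names (users_min_100 and so on), and the "in_range" selector name are hypothetical and differ from the names used above.

```python
def build_range_aggs(bounds, email_field="email", amount_field="amount", size=1000):
    """Build one terms/sum/bucket_selector aggregation per half-open range."""
    # None marks an open end: (None, b0), (b0, b1), ..., (bn, None).
    edges = [None] + list(bounds) + [None]
    aggs = {}
    for low, high in zip(edges, edges[1:]):
        conditions = []
        if low is not None:
            conditions.append(f"params.amountReceived >= {low}")
        if high is not None:
            conditions.append(f"params.amountReceived < {high}")
        low_label = "min" if low is None else str(low)
        high_label = "max" if high is None else str(high)
        aggs[f"users_{low_label}_{high_label}"] = {
            "terms": {"field": email_field, "size": size},
            "aggs": {
                "amount_received": {"sum": {"field": amount_field}},
                # Keeps only the email buckets whose sum is in this range.
                "in_range": {
                    "bucket_selector": {
                        "buckets_path": {"amountReceived": "amount_received"},
                        "script": " && ".join(conditions),
                    }
                },
            },
        }
    return {"size": 0, "aggs": aggs}


body = build_range_aggs([100, 300, 600])
print(list(body["aggs"]))
```

The resulting dictionary can be serialized to JSON and sent as the body of the _search request.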

In the results, the number of buckets in users_99 is the number of unique emails whose total amount is less than 100. Similarly, users_100_299 contains as many buckets as there are unique emails whose total amount is at least 100 and less than 300, and so on.
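
Turning the response into the map from the question then comes down to counting buckets. Here is a minimal Python sketch, assuming the response has already been parsed into a dictionary; the function name and the trimmed sample response are made up for illustration.

```python
def counts_from_response(response):
    """Map each top-level aggregation name to its number of buckets."""
    return {
        name: len(agg["buckets"])
        for name, agg in response["aggregations"].items()
    }


# Hypothetical response, trimmed to the fields that matter here.
sample_response = {
    "aggregations": {
        "users_99": {
            "buckets": [
                {"key": "a@example.com", "doc_count": 2, "amount_received": {"value": 90.0}},
            ]
        },
        "users_100_299": {
            "buckets": [
                {"key": "b@example.com", "doc_count": 1, "amount_received": {"value": 120.0}},
            ]
        },
        "users_300_599": {"buckets": []},
        "users_600": {
            "buckets": [
                {"key": "c@example.com", "doc_count": 1, "amount_received": {"value": 700.0}},
            ]
        },
    }
}
print(counts_from_response(sample_response))
```

Keep in mind that each terms aggregation only considers up to "size" emails (1000 here), so the counts are complete only if the index contains no more unique emails than that.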



Source: https://stackoverflow.com/questions/50937480/range-elasticsearch-aggregation
