Question
I want to calculate the difference between nested aggregation results for two dates.
More concretely, is it possible to calculate date_1.buckets.field_1.buckets.field_2.buckets.field_3.value - date_2.buckets.field_1.buckets.field_2.buckets.field_3.value
given the request/response below? Is that possible with Elasticsearch v1.0.1?
The aggregation query request looks like this:
{
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"bool": {
"must": [
{
"terms": {
"date": [
"2014-08-18 00:00:00.0",
"2014-08-15 00:00:00.0"
]
}
}
]
}
}
}
},
"aggs": {
"date_1": {
"filter": {
"terms": {
"date": [
"2014-08-18 00:00:00.0"
]
}
},
"aggs": {
"my_agg_1": {
"terms": {
"field": "field_1",
"size": 2147483647,
"order": {
"_term": "desc"
}
},
"aggs": {
"my_agg_2": {
"terms": {
"field": "field_2",
"size": 2147483647,
"order": {
"_term": "desc"
}
},
"aggs": {
"my_agg_3": {
"sum": {
"field": "field_3"
}
}
}
}
}
}
}
},
"date_2": {
"filter": {
"terms": {
"date": [
"2014-08-15 00:00:00.0"
]
}
},
"aggs": {
"my_agg_1": {
"terms": {
"field": "field_1",
"size": 2147483647,
"order": {
"_term": "desc"
}
},
"aggs": {
"my_agg_2": {
"terms": {
"field": "field_2",
"size": 2147483647,
"order": {
"_term": "desc"
}
},
"aggs": {
"my_agg_3": {
"sum": {
"field": "field_3"
}
}
}
}
}
}
}
}
}
}
And the response looks like this:
{
"took": 236,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"hits": {
"total": 1646,
"max_score": 0,
"hits": []
},
"aggregations": {
"date_1": {
"doc_count": 823,
"field_1": {
"buckets": [
{
"key": "field_1_key_1",
"doc_count": 719,
"field_2": {
"buckets": [
{
"key": "key_1",
"doc_count": 275,
"field_3": {
"value": 100
}
}
]
}
}
]
}
},
"date_2": {
"doc_count": 823,
"field_1": {
"buckets": [
{
"key": "field_1_key_1",
"doc_count": 719,
"field_2": {
"buckets": [
{
"key": "key_1",
"doc_count": 275,
"field_3": {
"value": 80
}
}
]
}
}
]
}
}
}
}
Thank you.
Answer 1:
The Elasticsearch DSL does not allow arithmetic operations between the results of two aggregations, not even with scripts (at least up to version 1.1.1, as far as I know).
Such operations need to be handled on the client side, after processing the aggregation results.
Reference
elasticsearch aggregation to sort by ratio of aggregations
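For the response shown in the question, the client-side step could look roughly like the Python sketch below. It is only an illustration under a few assumptions: the function names nested_sums and diff_between_dates are made up, the aggregation names field_1, field_2 and field_3 are taken from the sample response (the request calls them my_agg_1 and so on, so adjust them to whatever your response really contains), and a bucket that is missing for one of the dates is treated as 0.

```python
def nested_sums(date_agg, level_1="field_1", level_2="field_2", metric="field_3"):
    """Flatten one date filter aggregation into {(key_1, key_2): sum}."""
    sums = {}
    for b1 in date_agg[level_1]["buckets"]:
        for b2 in b1[level_2]["buckets"]:
            sums[(b1["key"], b2["key"])] = b2[metric]["value"]
    return sums


def diff_between_dates(aggregations, first="date_1", second="date_2"):
    """Return {(key_1, key_2): first - second}; a missing bucket counts as 0 (assumption)."""
    a = nested_sums(aggregations[first])
    b = nested_sums(aggregations[second])
    return {k: a.get(k, 0) - b.get(k, 0) for k in sorted(set(a) | set(b))}


# Trimmed copy of the "aggregations" part of the response in the question.
aggregations = {
    "date_1": {"doc_count": 823, "field_1": {"buckets": [
        {"key": "field_1_key_1", "doc_count": 719, "field_2": {"buckets": [
            {"key": "key_1", "doc_count": 275, "field_3": {"value": 100}}]}}]}},
    "date_2": {"doc_count": 823, "field_1": {"buckets": [
        {"key": "field_1_key_1", "doc_count": 719, "field_2": {"buckets": [
            {"key": "key_1", "doc_count": 275, "field_3": {"value": 80}}]}}]}},
}

result = diff_between_dates(aggregations)
print(result)  # {('field_1_key_1', 'key_1'): 20}
```

Keying the result by the (field_1, field_2) bucket keys avoids relying on both dates returning their buckets in the same order.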
Answer 2:
With newer Elasticsearch versions (e.g. 5.6.9) this is possible:
{
"size": 0,
"query": {
"constant_score": {
"filter": {
"bool": {
"filter": [
{
"range": {
"date_created": {
"gte": "2018-06-16T00:00:00+02:00",
"lte": "2018-06-16T23:59:59+02:00"
}
}
}
]
}
}
}
},
"aggs": {
"by_millisec": {
"range" : {
"script" : {
"lang": "painless",
"source": "doc['date_delivered'][0] - doc['date_created'][0]"
},
"ranges" : [
{ "key": "<1sec", "to": 1000.0 },
{ "key": "1-5sec", "from": 1000.0, "to": 5000.0 },
{ "key": "5-30sec", "from": 5000.0, "to": 30000.0 },
{ "key": "30-60sec", "from": 30000.0, "to": 60000.0 },
{ "key": "1-2min", "from": 60000.0, "to": 120000.0 },
{ "key": "2-5min", "from": 120000.0, "to": 300000.0 },
{ "key": "5-10min", "from": 300000.0, "to": 600000.0 },
{ "key": ">10min", "from": 600000.0 }
]
}
}
}
}
Answer 3:
In 1.0.1 I couldn't find anything, but in 1.4.2 you could try the scripted_metric aggregation (still experimental).
See the scripted_metric documentation page.
I am not good with the Elasticsearch syntax, but I think your metric inputs would be:
init_script - just initialize an accumulator for each date:
"init_script": "_agg.d1Val = 0; _agg.d2Val = 0;"
map_script - test the date of the document and add to the right accumulator:
"map_script": "if (doc.date == firstDate) { _agg.d1Val += doc.field_3; } else { _agg.d2Val += doc.field_3; };",
reduce_script - accumulate the intermediate data from the various shards and return the final result:
"reduce_script": "totalD1 = 0; totalD2 = 0; for (agg in _aggs) { totalD1 += agg.d1Val; totalD2 += agg.d2Val; }; return totalD1 - totalD2"
I don't think you need a combine_script in this case.
Of course, if you can't use 1.4.2 then this is no help :-)
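To make the flow of the three scripts easier to follow, here is a plain-Python simulation of the init / map / reduce steps. This is not Elasticsearch code and does not call any Elasticsearch API; the function names, the two "shards" and the documents in them are invented for illustration only.

```python
def init_state():
    # init_script: one accumulator per date, created once per shard
    return {"d1Val": 0, "d2Val": 0}


def map_doc(state, doc, first_date):
    # map_script: runs for every matching document on the shard
    if doc["date"] == first_date:
        state["d1Val"] += doc["field_3"]
    else:
        state["d2Val"] += doc["field_3"]


def reduce_states(states):
    # reduce_script: merge the per-shard accumulators and return the difference
    total_d1 = sum(s["d1Val"] for s in states)
    total_d2 = sum(s["d2Val"] for s in states)
    return total_d1 - total_d2


def run(shards, first_date):
    states = []
    for docs in shards:
        state = init_state()
        for doc in docs:
            map_doc(state, doc, first_date)
        states.append(state)
    return reduce_states(states)


# Made-up documents spread over two hypothetical shards.
shards = [
    [{"date": "2014-08-18", "field_3": 60}, {"date": "2014-08-15", "field_3": 50}],
    [{"date": "2014-08-18", "field_3": 40}, {"date": "2014-08-15", "field_3": 30}],
]

difference = run(shards, "2014-08-18")
print(difference)  # 20
```

Note that the map step has to add to both accumulators; assigning instead of adding would keep only the last document seen on each shard.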
Source: https://stackoverflow.com/questions/25628761/how-to-calculate-difference-between-metrics-in-different-aggregations-in-elastic