问题
What I want to achieve is aggregation by unique pairs (city, STATE). As per Elasticsearch documentation The terms aggregation does not support collecting terms from multiple fields in the same document. Thus I created a nested agg like this:
{
"size": 0,
"aggs": {
"cities": {
"terms": {
"field": "address.city",
"size": 12
},
"aggs": {
"states": {
"terms": {
"field": "address.stateOrProvince"
},
"aggs": {
"topCity": {
"top_hits": {
"size": 1,
"sort": [
{
"price.value": {
"order": "desc" }}]}}}}}}}}
As a result of this aggregation I get response like this:
{
"aggregations": {
"cities": {
"buckets": [
{
"key": "las vegas",
"doc_count": 5927,
"states": {
"buckets": [
{ "key": "nv", "doc_count": 5840 },
{ "key": "nm", "doc_count": 85 }
]
}
},
{
"key": "jacksonville",
"doc_count": 5689,
"states": {
"buckets": [
{ "key": "fl", "doc_count": 2986 },
{ "key": "nc", "doc_count": 1962 },
{ "key": "ar", "doc_count": 290 }]}}]}}}
The question is how to get results ordered by the deepest doc_count?
Expected ordered list should be like this:
- las vegas, nv (5840)
- jacksonville, fl (2986)
- jacksonville, nc (1962)
- jacksonville, ar (290)
- las vegas, nm (85)
回答1:
I don't believe there is a way to sort on the inner doc_count accross multiple buckets. In ES 2.0 (still in Beta) you'll be able to take action on aggregations but that's not possible in ES 1.x
回答2:
I've managed to solve the problem by applying transform
"transform": {
"script": "ctx._source['address']['cityState'] = ctx._source['address']['city'] + ', ' + ctx._source['address']['state']"
}
and then aggregating on the newly added field. Works as expected!
来源:https://stackoverflow.com/questions/32908712/elasticsearch-aggregation-order-by-nested-bucket-doc-count