Question
I want to aggregate my data on one field and get the documents inside each aggregated bucket sorted by name.
My data:
{
  "_index": "testing-aggregation",
  "_type": "employee",
  "_id": "emp001_local000000000000001",
  "_score": 10.0,
  "_source": {
    "name": [
      "Person 01"
    ],
    "groupbyid": [
      "group0001"
    ],
    "ranking": [
      "2.0"
    ]
  }
},
{
  "_index": "testing-aggregation",
  "_type": "employee",
  "_id": "emp002_local000000000000001",
  "_score": 85146.375,
  "_source": {
    "name": [
      "Person 02"
    ],
    "groupbyid": [
      "group0001"
    ],
    "ranking": [
      "10.0"
    ]
  }
},
{
  "_index": "testing-aggregation",
  "_type": "employee",
  "_id": "emp003_local000000000000001",
  "_score": 20.0,
  "_source": {
    "name": [
      "Person 03"
    ],
    "groupbyid": [
      "group0002"
    ],
    "ranking": [
      "-1.0"
    ]
  }
},
{
  "_index": "testing-aggregation",
  "_type": "employee",
  "_id": "emp004_local000000000000001",
  "_score": 5.0,
  "_source": {
    "name": [
      "Person 04"
    ],
    "groupbyid": [
      "group0002"
    ],
    "ranking": [
      "2.0"
    ]
  }
}
My query:
{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "query": "name:emp*^1000.0"
          }
        }
      ]
    }
  },
  "aggs": {
    "groupbyid": {
      "terms": {
        "field": "groupbyid.raw",
        "order": {
          "top_hit_agg": "desc"
        },
        "size": 10
      },
      "aggs": {
        "top_hit_agg": {
          "terms": {
            "field": "name"
          }
        }
      }
    }
  }
}
My mapping:
{
  "name": {
    "type": "text",
    "fielddata": true,
    "fields": {
      "lower_case_sort": {
        "type": "text",
        "fielddata": true,
        "analyzer": "case_insensitive_sort"
      }
    }
  },
  "groupbyid": {
    "type": "text",
    "fielddata": true,
    "index": "analyzed",
    "fields": {
      "raw": {
        "type": "keyword",
        "index": "not_analyzed"
      }
    }
  }
}
I am getting data based on the average of the relevance of grouped records. Now, what I want is to first group the records by groupbyid and then, within each bucket, sort the data by the name field.
I want to group on one field and then sort each resulting bucket on another field. The data above is only a sample.
There are other fields such as created_on and updated_on, and I also want to be able to sort on those. I would also like to get the groups in alphabetical order.
I want to sort on a non-numeric (string) field. Sorting on a numeric field already works.
I can do it for the ranking field but not for the name field, which gives the error below.
Expected numeric type on field [name], but got [text];
Answer 1:
You're asking for a few things, so I'll try to answer them in turn.
Step 1: Sorting buckets by relevance
I am getting data based on the average of the relevance of grouped records.
If this is what you're attempting to do, it's not what the aggregation you wrote is doing. Terms aggregations default to sorting the buckets by the number of documents in each bucket, descending. The order of a terms aggregation also has to resolve to a numeric value (typically a metric such as avg or max), so it cannot point at something computed from the text field name. To sort the groups by "average relevance" (which I'll interpret as "average _score of documents in the group"), you'd need to add a sub-aggregation on the score and sort the terms aggregation by that:
"aggregations": {
  "most_relevant_groups": {
    "terms": {
      "field": "groupbyid.raw",
      "order": {
        "average_score": "desc"
      }
    },
    "aggs": {
      "average_score": {
        "avg": {
          "script": {
            "inline": "_score",
            "lang": "painless"
          }
        }
      }
    }
  }
}
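To make the bucket ordering concrete, here is a minimal Python sketch that mimics what the terms aggregation with the average_score sub-aggregation would compute. It does not talk to Elasticsearch. The function name groups_by_average_score is made up for illustration, and the _score values are simply copied from the sample hits in the question, so real scores will differ with your query.

```python
from collections import defaultdict
from statistics import mean

# (groupbyid, name, _score) copied from the sample hits in the question.
HITS = [
    ("group0001", "Person 01", 10.0),
    ("group0001", "Person 02", 85146.375),
    ("group0002", "Person 03", 20.0),
    ("group0002", "Person 04", 5.0),
]


def groups_by_average_score(hits):
    """Return [(group, average_score)] ordered by average score, descending.

    Hypothetical helper: it imitates a terms aggregation ordered by an
    avg sub-aggregation on _score.
    """
    scores = defaultdict(list)
    for group, _name, score in hits:
        scores[group].append(score)
    averages = [(group, mean(values)) for group, values in scores.items()]
    return sorted(averages, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for group, average in groups_by_average_score(HITS):
        print(group, average)
```

With the sample scores, group0001 comes first because its average _score is higher than that of group0002.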
Step 2: Sorting employees by name
Now, what I want is to first group the records by groupbyid and then, within each bucket, sort the data by the name field.
To sort the documents within each bucket, you can use a top_hits aggregation:
"aggregations": {
  "most_relevant_groups": {
    "terms": {
      "field": "groupbyid.raw",
      "order": {
        "average_score": "desc"
      }
    },
    "aggs": {
      "employees": {
        "top_hits": {
          "size": 10, // top_hits returns 3 hits per bucket by default; set as needed
          "sort": [
            {
              "name.lower_case_sort": {
                "order": "asc"
              }
            }
          ]
        }
      }
    }
  }
}
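As a rough illustration of what top_hits with a sort on name.lower_case_sort returns per bucket, the Python sketch below groups the sample employees and sorts each group by name, case-insensitively. It is only a local simulation: top_hits_sorted_by_name is a hypothetical helper, and str.lower stands in for the case_insensitive_sort analyzer, whose actual definition is not shown in the question.

```python
from collections import defaultdict

# (groupbyid, name) pairs from the question's sample data, deliberately unordered.
EMPLOYEES = [
    ("group0002", "Person 04"),
    ("group0001", "Person 02"),
    ("group0002", "Person 03"),
    ("group0001", "Person 01"),
]


def top_hits_sorted_by_name(employees, size=10):
    """Return {group: names}, each list sorted by name and cut to `size`.

    Hypothetical helper: str.lower is an assumed stand-in for the
    case_insensitive_sort analyzer behind name.lower_case_sort.
    """
    buckets = defaultdict(list)
    for group, name in employees:
        buckets[group].append(name)
    return {
        group: sorted(names, key=str.lower)[:size]
        for group, names in buckets.items()
    }


if __name__ == "__main__":
    for group, names in top_hits_sorted_by_name(EMPLOYEES).items():
        print(group, names)
```

The size argument plays the same role as "size" in top_hits: it caps how many documents are kept per bucket after sorting.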
Step 3: Putting it all together
Putting both of the above together, the following aggregation should suit your needs. Note that I used a function_score query to simulate "relevance" based on ranking; you can substitute any query that produces the relevance you need:
POST /testing-aggregation/employee/_search
{
  "size": 0,
  "query": {
    "function_score": {
      "functions": [
        {
          "field_value_factor": {
            "field": "ranking"
          }
        }
      ]
    }
  },
  "aggs": {
    "groupbyid": {
      "terms": {
        "field": "groupbyid.raw",
        "size": 10,
        "order": {
          "average_score": "desc"
        }
      },
      "aggs": {
        "average_score": {
          "avg": {
            "script": {
              "inline": "_score",
              "lang": "painless"
            }
          }
        },
        "employees": {
          "top_hits": {
            "size": 10,
            "sort": [
              {
                "name.lower_case_sort": {
                  "order": "asc"
                }
              }
            ]
          }
        }
      }
    }
  }
}
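The question also mentions sorting on created_on or updated_on and listing the groups alphabetically, which the request above does not cover. One possible way to vary the request is sketched below in Python. The helper build_search_body is hypothetical and only builds the request body as a dict; it never contacts a cluster. It assumes that created_on is mapped as a sortable type such as date (the mapping in the question does not show it), and that the cluster is on 5.x, where the terms order key for alphabetical bucket ordering is _term (later versions use _key instead).

```python
import json


def build_search_body(sort_field="name.lower_case_sort", sort_order="asc",
                      group_order=None, size=10, query=None):
    """Build the aggregation request body as a plain dict.

    Hypothetical helper. group_order defaults to ordering buckets by the
    average_score sub-aggregation, as in the request above.
    """
    if group_order is None:
        group_order = {"average_score": "desc"}
    body = {
        "size": 0,
        "aggs": {
            "groupbyid": {
                "terms": {
                    "field": "groupbyid.raw",
                    "size": size,
                    "order": group_order,
                },
                "aggs": {
                    "average_score": {
                        "avg": {"script": {"inline": "_score", "lang": "painless"}}
                    },
                    "employees": {
                        "top_hits": {
                            "size": size,
                            "sort": [{sort_field: {"order": sort_order}}],
                        }
                    },
                },
            }
        },
    }
    if query is not None:
        body["query"] = query
    return body


if __name__ == "__main__":
    # Assumed variation: groups in alphabetical order (5.x "_term" key),
    # newest documents first inside each group.
    body = build_search_body(
        sort_field="created_on",
        sort_order="desc",
        group_order={"_term": "asc"},
    )
    print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of the same _search request. Only the terms order and the top_hits sort change; the rest of the aggregation stays as shown above.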
Source: https://stackoverflow.com/questions/60539380/elasticsearch-aggregation-sorting-in-on-nonnumric-field-5-3