Include parent _source fields in nested top hits aggregation

Posted by 随声附和 on 2021-01-28 21:13:22

Question


I am trying to aggregate on a field and get the top records using top_hits, but I want the response to include other fields that are not part of the nested property mapping. Currently, if I specify _source:{"include":[]}, I only get the fields that belong to the current nested property.

Here is my mapping:

{
  "my_cart":{
    "mappings":{
      "properties":{
        "store":{
          "properties":{
            "name":{
              "type":"keyword"
            }
          }
        },
        "sales":{
          "type":"nested",
          "properties":{
            "Price":{
              "type":"float"
            },
            "Time":{
              "type":"date"
            },
            "product":{
              "properties":{
                "name":{
                  "type":"text",
                  "fields":{
                    "keyword":{
                      "type":"keyword"
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}
  

UPDATE

Joe's answer solved the issue above.

My remaining issue is with the response: I get the product name as "key" along with the other details, but the hits also contain the other product names that were part of the same transaction on the billing receipt. I want to aggregate on the product name and find the last sold date of each product, along with other details such as price, quantity, etc.

Current Response

"aggregations" : {
    "aggregate_by_most_sold_product" : {
      "doc_count" : 2878592,
      "all_products" : {
        "buckets" : [
          {
            "key" : "shampoo",
            "doc_count" : 1,
            "lastSold" : {
              "value" : 1.602569793E12,
              "value_as_string" : "2018-10-13T06:16:33.000Z"
            },
            "using_reverse_nested" : {
              "doc_count" : 1,
              "latest product" : {
                "hits" : {
                  "total" : {
                    "value" : 1,
                    "relation" : "eq"
                  },
                  "max_score" : 0.0,
                  "hits" : [
                    {
                      "_index" : "my_cart",
                      "_type" : "_doc",
                      "_id" : "36303258-9r7w-4b3e-ba3d-fhds7cfec7aa",
                      "_source" : {
                        "cashier" : {
                          "firstname" : "romeo",
                          "uuid" : "2828dhd-0911-7229-a4f8-8ab80dde86a6"
                        },
                        "product_price" : {
                          "price" : 20,
                          "discount_offered" : 10
                        },
                        "sales" : [
                          {
                            "product" : {
                              "name" : "shampoo",
                              "time" : "2018-10-13T04:44:26+00:00"
                            }
                          },
                          {
                            "product" : {
                              "name" : "noodles",
                              "time" : "2018-10-13T04:42:26+00:00"
                            }
                          },
                          {
                            "product" : {
                              "name" : "biscuits",
                              "time" : "2018-10-13T04:41:26+00:00"
                            }
                          }
                        ]
                      }
                    }
                  ]
                }
              }
            }
          }
        ]
      }
    }
}


Expected Response

The response gives me all the product names in that transaction, which increases the bucket size. I only want a single product name, with its last sold date and other details, for each product.

My aggregation is the same as the one in Joe's answer.

I would also like to know whether I can add scripts to perform calculations on the fields returned in _source.

Example: price - discount_offered = final amount.


Answer 1:


The nested context does not have access to the parent unless you use reverse_nested. In that case, however, you lose the ability to retrieve only the applicable nested subdocument. Luckily, there is a way to sort a terms aggregation by the result of a different, numeric one:

GET my_cart/_search
{
  "size": 0,
  "aggs": {
    "aggregate": {
      "nested": {
        "path": "sales"
      },
      "aggs": {
        "all_products": {
          "terms": {
            "field": "sales.product.name.keyword",
            "size": 6500,
            "order": {                                <--
              "lowest_date": "asc"
            }
          },
          "aggs": {
            "lowest_date": {                          <--
              "min": {
                "field": "sales.Time"
              }
            },
            "using_reverse_nested": {
              "reverse_nested": {},                   <--
              "aggs": {
                "latest product": {
                  "top_hits": {
                    "_source": {
                      "includes": [
                        "store.name"
                      ]
                    },
                    "size": 1
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

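The caveat is that the pieces end up in different parts of each bucket rather than together inside a single top_hits hit: the product name is the bucket key, the date is in lowest_date, and store.name is in the top_hits _source. I suspect you're probably already doing some post-processing on the client side, where you could combine those entries: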

"aggregate" : {
  ...
  "all_products" : {
    ...
    "buckets" : [
      {
        "key" : "myproduct",                     <--
        ...
        "using_reverse_nested" : {
          ...
          "latest product" : {
            "hits" : {
              ...
              "hits" : [
                {
                  ...
                  "_source" : {
                    "store" : {
                      "name" : "mystore"         <--
                    }
                  }
                }
              ]
            }
          }
        },
        "lowest_date" : {
          "value" : 1.4200704E12,
          "value_as_string" : "2015/01/01"       <--
        }
      }
    ]
  }
}
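
The client-side post-processing mentioned above could look roughly like the following Python sketch. It assumes the aggregation names from the query in this answer ("aggregate", "all_products", "using_reverse_nested", "latest product", "lowest_date"). The function name flatten_buckets is hypothetical, and the "sales" and "product_price" handling only applies if you add those fields to the top_hits _source includes, as in the response shown in the question. The final amount is computed on the client here, not by an Elasticsearch script.

```python
from typing import Any, Dict, List


def flatten_buckets(aggregations: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Merge each bucket's key, date and parent _source into one flat row.

    `aggregations` is the "aggregations" object of the search response.
    """
    rows: List[Dict[str, Any]] = []
    for bucket in aggregations["aggregate"]["all_products"]["buckets"]:
        hits = bucket["using_reverse_nested"]["latest product"]["hits"]["hits"]
        source = hits[0]["_source"] if hits else {}

        row: Dict[str, Any] = {
            "product": bucket["key"],
            "lowest_date": bucket["lowest_date"].get("value_as_string"),
            "store_name": source.get("store", {}).get("name"),
        }

        # The parent _source carries every sales entry of the document, so keep
        # only the entry whose product name matches this bucket's key.
        # Assumes "sales" is a list, as in the question's response.
        row["sale"] = next(
            (
                sale
                for sale in source.get("sales", [])
                if sale.get("product", {}).get("name") == bucket["key"]
            ),
            None,
        )

        # Assumption: "product_price" has the shape shown in the question.
        pricing = source.get("product_price")
        if pricing is not None:
            row["final_amount"] = pricing["price"] - pricing["discount_offered"]

        rows.append(row)
    return rows


# Usage with the trimmed response shown above.
sample = {
    "aggregate": {
        "all_products": {
            "buckets": [
                {
                    "key": "myproduct",
                    "using_reverse_nested": {
                        "latest product": {
                            "hits": {
                                "hits": [
                                    {"_source": {"store": {"name": "mystore"}}}
                                ]
                            }
                        }
                    },
                    "lowest_date": {
                        "value": 1.4200704e12,
                        "value_as_string": "2015/01/01",
                    },
                }
            ]
        }
    }
}

rows = flatten_buckets(sample)
print(rows)
```

Each row then holds the product name, its date and the parent fields side by side, which also sidesteps the extra product names that come back in the parent _source.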


Source: https://stackoverflow.com/questions/64314470/include-parent-source-fields-in-nested-top-hits-aggregation
