How to return the count of unique documents using an Elasticsearch aggregation

日久生厌 2020-12-10 12:54

I ran into a problem: Elasticsearch does not return the count of unique documents when I use only a terms aggregation on a nested field.

Here is an example of ou

1 answer
  • 2020-12-10 13:56

    I think you need a reverse_nested aggregation: you want to aggregate on a nested value but count the root documents, not the nested ones.

    {
      "query": {
        "bool": {
          "must": [
            {
              "term": {
                "last_name": "smith"
              }
            }
          ]
        }
      },
      "aggs": {
        "location": {
          "nested": {
            "path": "location"
          },
          "aggs": {
            "state": {
              "terms": {
                "field": "location.state",
                "size": 10
              },
              "aggs": {
                "top_reverse_nested": {
                  "reverse_nested": {}
                }
              }
            }
          }
        }
      }
    }
    

    And, as a result, you would see something like this:

    "aggregations": {
      "location": {
        "doc_count": 6,
        "state": {
          "doc_count_error_upper_bound": 0,
          "sum_other_doc_count": 0,
          "buckets": [
            {
              "key": "ny",
              "doc_count": 4,
              "top_reverse_nested": {
                "doc_count": 2
              }
            },
            {
              "key": "ca",
              "doc_count": 2,
              "top_reverse_nested": {
                "doc_count": 2
              }
            }
          ]
        }
      }
    }
    

    The number you are looking for is the doc_count under top_reverse_nested. One caveat: if I'm not mistaken, "doc_count": 6 under location is the NESTED document count, and the doc_count on each state bucket also counts nested documents, so don't read those numbers as root document counts. For a root document with three matching nested documents, that count would be 3, not 1.
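
To make the difference between the two counts concrete, here is a minimal Python sketch that reads both numbers out of a response shaped like the sample above. The `response` dict is just that sample pasted in as a literal, and the helper names `root_counts_by_state` and `nested_counts_by_state` are hypothetical, chosen only for illustration. No Elasticsearch client is involved.

```python
# Sample aggregation result, copied from the response shown in the answer.
response = {
    "aggregations": {
        "location": {
            "doc_count": 6,
            "state": {
                "doc_count_error_upper_bound": 0,
                "sum_other_doc_count": 0,
                "buckets": [
                    {"key": "ny", "doc_count": 4,
                     "top_reverse_nested": {"doc_count": 2}},
                    {"key": "ca", "doc_count": 2,
                     "top_reverse_nested": {"doc_count": 2}},
                ],
            },
        }
    }
}


def _state_buckets(resp):
    """Return the terms buckets under the nested 'location' aggregation."""
    return resp["aggregations"]["location"]["state"]["buckets"]


def root_counts_by_state(resp):
    """Map each state to its ROOT document count (from reverse_nested)."""
    return {b["key"]: b["top_reverse_nested"]["doc_count"]
            for b in _state_buckets(resp)}


def nested_counts_by_state(resp):
    """Map each state to its NESTED document count (the bucket's own doc_count)."""
    return {b["key"]: b["doc_count"] for b in _state_buckets(resp)}


print(root_counts_by_state(response))    # {'ny': 2, 'ca': 2}
print(nested_counts_by_state(response))  # {'ny': 4, 'ca': 2}
```

In this sample, "ny" has 4 matching nested documents but only 2 root documents, so the unique-document count comes from `top_reverse_nested`.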
