ElasticSearch count multiple fields grouped by

 ̄綄美尐妖づ submitted on 2021-01-01 06:31:00

Question


I have documents like

{"domain":"US", "zipcode":"11111", "eventType":"click", "id":"1", "time":100}

{"domain":"US", "zipcode":"22222", "eventType":"sell", "id":"2", "time":200}

{"domain":"US", "zipcode":"22222", "eventType":"click", "id":"3","time":150}

{"domain":"US", "zipcode":"11111", "eventType":"sell", "id":"4","time":350}

{"domain":"US", "zipcode":"33333", "eventType":"sell", "id":"5","time":225}

{"domain":"EU", "zipcode":"44444", "eventType":"click", "id":"5","time":120}

I want to filter these documents by eventType=sell and time between 125 and 400, group by domain followed by zipcode, and count the documents in each bucket. So my output would look like this (the three click documents, i.e. the first, third, and last, would be excluded by the filters):

US, 11111,1

US, 22222,1

US, 33333,1

In SQL, this would be straightforward, but I am not able to get it to work in Elasticsearch. Could someone please help me out here?

How do I write an Elasticsearch query to accomplish the above?


Answer 1:


This query seems to do what you want:

POST /test_index/_search
{
   "size": 0,
   "query": {
      "bool": {
         "filter": [
            {
               "term": {
                  "eventType": "sell"
               }
            },
            {
               "range": {
                  "time": {
                     "gte": 125,
                     "lte": 400
                  }
               }
            }
         ]
      }
   },
   "aggs": {
      "zipcode_terms": {
         "terms": {
            "field": "zipcode"
         }
      }
   }
}

returning

{
   "took": 8,
   "timed_out": false,
   "_shards": {
      "total": 5,
      "successful": 5,
      "failed": 0
   },
   "hits": {
      "total": 3,
      "max_score": 0,
      "hits": []
   },
   "aggregations": {
      "zipcode_terms": {
         "doc_count_error_upper_bound": 0,
         "sum_other_doc_count": 0,
         "buckets": [
            {
               "key": "11111",
               "doc_count": 1
            },
            {
               "key": "22222",
               "doc_count": 1
            },
            {
               "key": "33333",
               "doc_count": 1
            }
         ]
      }
   }
}

(Note that there is only 1 "sell" at "22222", not 2).
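
The expected counts can be checked without a cluster by applying the same filter and grouping to the six sample documents in plain Python. This is only a local sanity check, not an Elasticsearch call; the function name `count_by_domain_zipcode` is made up for illustration.

```python
from collections import Counter

# The six sample documents from the question.
docs = [
    {"domain": "US", "zipcode": "11111", "eventType": "click", "id": "1", "time": 100},
    {"domain": "US", "zipcode": "22222", "eventType": "sell", "id": "2", "time": 200},
    {"domain": "US", "zipcode": "22222", "eventType": "click", "id": "3", "time": 150},
    {"domain": "US", "zipcode": "11111", "eventType": "sell", "id": "4", "time": 350},
    {"domain": "US", "zipcode": "33333", "eventType": "sell", "id": "5", "time": 225},
    {"domain": "EU", "zipcode": "44444", "eventType": "click", "id": "5", "time": 120},
]


def count_by_domain_zipcode(docs, event_type="sell", gte=125, lte=400):
    """Keep docs matching the event type and inclusive time range,
    then count them per (domain, zipcode) pair."""
    return Counter(
        (d["domain"], d["zipcode"])
        for d in docs
        if d["eventType"] == event_type and gte <= d["time"] <= lte
    )


counts = count_by_domain_zipcode(docs)
for (domain, zipcode), n in sorted(counts.items()):
    print(f"{domain}, {zipcode},{n}")
```

Only documents 2, 4, and 5 pass both conditions, so each of the three US zipcodes ends up with a count of 1, matching the buckets above.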

Here is some code I used to test it:

http://sense.qbox.io/gist/1c4cb591ab72a6f3ae681df30fe023ddfca4225b

You might want to take a look at terms aggregations, the bool filter, and range filters.

EDIT: I just realized I left out the domain part. To add it, put a terms aggregation on domain at the top level and nest the zipcode terms aggregation inside it as a sub-aggregation.
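
One way to add the domain grouping mentioned in the EDIT is sketched below. It builds the request body as a Python dict and flattens the nested buckets into rows. The aggregation name `domain_terms` and the helpers `build_query` and `flatten` are made up for illustration, and the body assumes a `bool` query with a `filter` clause, the form current Elasticsearch versions accept. It has not been run against the test index from this answer.

```python
import json


def build_query(event_type="sell", gte=125, lte=400):
    """Request body: filter first, then bucket by domain and,
    inside each domain bucket, by zipcode."""
    return {
        "size": 0,  # only the aggregation buckets are needed, not the hits
        "query": {
            "bool": {
                "filter": [
                    {"term": {"eventType": event_type}},
                    {"range": {"time": {"gte": gte, "lte": lte}}},
                ]
            }
        },
        "aggs": {
            "domain_terms": {  # hypothetical name for the outer aggregation
                "terms": {"field": "domain"},
                # Sub-aggregation: evaluated once per domain bucket.
                "aggs": {"zipcode_terms": {"terms": {"field": "zipcode"}}},
            }
        },
    }


def flatten(response):
    """Turn nested domain -> zipcode buckets into (domain, zipcode, count) rows."""
    rows = []
    for domain_bucket in response["aggregations"]["domain_terms"]["buckets"]:
        for zip_bucket in domain_bucket["zipcode_terms"]["buckets"]:
            rows.append((domain_bucket["key"], zip_bucket["key"], zip_bucket["doc_count"]))
    return rows


# Print the body so it can be pasted after POST /test_index/_search.
print(json.dumps(build_query(), indent=3))
```

Passing the parsed search response to `flatten` should give one row per (domain, zipcode) pair, which is the shape asked for in the question. Whether the keys come back as `US` or as a lowercased token depends on how the fields are mapped, which the question does not show.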



Source: https://stackoverflow.com/questions/34191810/elasticsearch-count-multiple-fields-grouped-by