Elasticsearch - group by day of week and hour

Posted by 强颜欢笑 on 2019-12-06 03:41:39

Question


I need to get some data grouped by day of week and hour. For example, this request:

curl -XGET 'http://localhost:9200/testing/hello/_search?pretty=true' -d '
{
        "size": 0,
        "aggs": {
          "articles_over_time" : {
            "date_histogram" : {
                "field" : "date",
                "interval" : "hour",
                "format": "E - k"
            }
          }
        }
}
'

Gives me this:

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 2857,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "articles_over_time" : {
      "buckets" : [ {
        "key_as_string" : "Fri - 17",
        "key" : 1391792400000,
        "doc_count" : 6
      },
     ...
      {
        "key_as_string" : "Wed - 22",
        "key" : 1411596000000,
        "doc_count" : 1
      }, {
        "key_as_string" : "Wed - 22",
        "key" : 1411632000000,
        "doc_count" : 1
      } ]
    }
  }
}

Now I need to sum the doc counts of all buckets that share the same value, such as "Wed - 22". How can I do this? Or is there another approach?


Answer 1:


The same kind of problem has been solved in this thread.

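Adapting that solution to your problem, we need a script that converts the date into the day of week and hour of day:

// Build a java.util.Date from the value of the document's date field
Date date = new Date(doc['date'].value);
// 'EEE, HH' = abbreviated day of week, then hour of day (00-23)
java.text.SimpleDateFormat format = new java.text.SimpleDateFormat('EEE, HH');
// The last expression is the value the script returns
format.format(date)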

And use it in a query:

{
    "aggs": {
        "perWeekDay": {
            "terms": {
                "script": "Date date = new Date(doc['date'].value) ;java.text.SimpleDateFormat format = new java.text.SimpleDateFormat('EEE, HH');format.format(date)"
            }
        }
    }
}
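
To sanity-check which labels the script above should produce, the same conversion can be sketched outside Elasticsearch. The Python below is only an illustration: the function name `day_hour_label` is hypothetical, and it assumes UTC, whereas `SimpleDateFormat` uses the JVM's default time zone, so labels computed on a real node may be shifted.

```python
from datetime import datetime, timezone

# Abbreviated English day names, indexed by datetime.weekday() (Monday == 0).
DAY_NAMES = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def day_hour_label(epoch_millis):
    """Mimic the 'EEE, HH' pattern for an epoch-milliseconds value (UTC assumed)."""
    moment = datetime.fromtimestamp(epoch_millis / 1000, tz=timezone.utc)
    return "{}, {:02d}".format(DAY_NAMES[moment.weekday()], moment.hour)


# Bucket keys taken from the response shown in the question.
for key in (1391792400000, 1411596000000):
    print(key, day_hour_label(key))
```

Documents whose labels match here would end up in the same terms bucket, regardless of which week they fall in.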



Answer 2:


You can try doing a terms aggregation on the "key_as_string" field from the aggregation results, using a sub-aggregation.

Hope that helps.
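
If a server-side aggregation over "key_as_string" turns out not to be available, the same summation can be done on the client once the response comes back. A minimal sketch, assuming the response has already been parsed into a Python dict shaped like the output in the question; the function name `sum_by_label` is hypothetical, and the sample contains only the buckets visible in the question.

```python
from collections import Counter


def sum_by_label(response, agg_name="articles_over_time"):
    """Sum doc_count over all buckets that share the same key_as_string."""
    totals = Counter()
    for bucket in response["aggregations"][agg_name]["buckets"]:
        totals[bucket["key_as_string"]] += bucket["doc_count"]
    return dict(totals)


# Sample with only the buckets shown in the question (elided ones omitted).
sample = {
    "aggregations": {
        "articles_over_time": {
            "buckets": [
                {"key_as_string": "Fri - 17", "key": 1391792400000, "doc_count": 6},
                {"key_as_string": "Wed - 22", "key": 1411596000000, "doc_count": 1},
                {"key_as_string": "Wed - 22", "key": 1411632000000, "doc_count": 1},
            ]
        }
    }
}

print(sum_by_label(sample))
```

This keeps the original date_histogram query unchanged, at the cost of transferring every hourly bucket to the client.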




Answer 3:


This is because you are using an interval of 'hour', but the date format is 'day' (E - k).

Change your interval to 'day', and you'll no longer get separate buckets for 'Wed - 22'.

Or, if you do want per-hour buckets, change your format to include the hour field.



Source: https://stackoverflow.com/questions/26060180/elasticsearch-group-by-day-of-week-and-hour
