How to group time column into 5 second intervals and count rows using Presto?

Submitted by 夙愿已清 on 2019-12-20 04:52:48

Question


I am using Presto and Zeppelin. I have a lot of raw data that I need to summarize.

I want to group the rows into 5-second intervals and count them.

serviceType        logType     date
------------------------------------------------------
service1           log1        2017-10-24 23:00:23.206
service1           log1        2017-10-24 23:00:23.207
service1           log1        2017-10-24 23:00:25.206
service2           log1        2017-10-24 23:00:24.206
service1           log2        2017-10-24 23:00:27.206
service1           log2        2017-10-24 23:00:29.302

The expected result is:

serviceType        logType     date                       cnt
--------------------------------------------------------------
service1           log1        2017-10-24 23:00:20          2
service2           log1        2017-10-24 23:00:20          1
service1           log1        2017-10-24 23:00:25          1
service1           log2        2017-10-24 23:00:25          2

First, I have to migrate the stored data to new tables.

Second, I have to group incoming data and save it to the new table in real time.

I find it hard to write the SQL for this.

Please help me.

Do I have to use the Python interpreter?


Answer 1:


You can do this in two steps:

  1. Discard the millisecond part of the timestamp with date_trunc('second', ts).
  2. Round the truncated timestamp down to a multiple of 5 seconds by subtracting interval '1' second * (second(ts) % 5).
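To see what each step contributes, the intermediate values can be selected side by side. This is only a sketch on a single literal timestamp taken from the question's sample data; the column aliases are made up for illustration.

```sql
SELECT
    ts,
    -- step 1: drop the millisecond part
    date_trunc('second', ts) AS ts_seconds,
    -- how many whole seconds ts lies past the start of its 5-second bucket
    second(ts) % 5 AS seconds_past_bucket,
    -- step 2: step back to the start of the bucket
    date_trunc('second', ts) - interval '1' second * (second(ts) % 5) AS ts_rounded
FROM (VALUES timestamp '2017-10-24 23:00:27.206') AS t(ts);
```

For this input, second(ts) is 27, so seconds_past_bucket is 2 and ts_rounded falls on 23:00:25.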

Example putting this together:

presto> SELECT ts_rounded, count(*)
     -> FROM (
     ->     SELECT date_trunc('second', ts) - interval '1' second * (second(ts) % 5) AS ts_rounded
     ->     FROM (VALUES timestamp '2017-10-24 23:01:20.206',
     ->         timestamp '2017-10-24 23:01:23.206',
     ->         timestamp '2017-10-24 23:01:23.207',
     ->         timestamp '2017-10-24 23:01:26.206') AS t(ts)
     -> )
     -> GROUP BY ts_rounded ORDER BY ts_rounded;
       ts_rounded        | _col1
-------------------------+-------
 2017-10-24 23:01:20.000 |     3
 2017-10-24 23:01:25.000 |     1
(2 rows)
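The same expression can be applied to the data from the question by also grouping on serviceType and logType. The sketch below inlines the question's six sample rows with VALUES so that it runs on its own; against a real table, the VALUES block would be replaced by the table name, and the column called ts here corresponds to the question's date column. The aliases bucket_start and cnt are illustrative.

```sql
SELECT serviceType, logType, bucket_start, count(*) AS cnt
FROM (
    SELECT
        serviceType,
        logType,
        -- truncate to whole seconds, then round down to a multiple of 5 seconds
        date_trunc('second', ts) - interval '1' second * (second(ts) % 5) AS bucket_start
    FROM (VALUES
        ('service1', 'log1', timestamp '2017-10-24 23:00:23.206'),
        ('service1', 'log1', timestamp '2017-10-24 23:00:23.207'),
        ('service1', 'log1', timestamp '2017-10-24 23:00:25.206'),
        ('service2', 'log1', timestamp '2017-10-24 23:00:24.206'),
        ('service1', 'log2', timestamp '2017-10-24 23:00:27.206'),
        ('service1', 'log2', timestamp '2017-10-24 23:00:29.302')
    ) AS t(serviceType, logType, ts)
)
GROUP BY serviceType, logType, bucket_start
ORDER BY bucket_start, serviceType, logType;
```

This should yield the four groups listed in the question's expected result. For the migration step, one option is to prefix such a query with CREATE TABLE ... AS or INSERT INTO ... so that the result is written to a new table, provided the connector in use supports writes. No Python interpreter is needed for the grouping itself.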


Source: https://stackoverflow.com/questions/47066024/how-to-group-time-column-into-5-second-intervals-and-count-rows-using-presto
