Question
I want to compute an aggregate in Presto SQL over a window that looks back x hours/minutes/seconds from each row.
Data
id | timestamp | status
-------------------------------------------
A | 2018-01-01 03:00:00 | GOOD
A | 2018-01-01 04:00:00 | BAD
A | 2018-01-01 05:00:00 | GOOD
A | 2018-01-01 09:00:00 | BAD
A | 2018-01-01 09:15:00 | BAD
A | 2018-01-01 13:00:00 | GOOD
A | 2018-01-01 14:00:00 | GOOD
B | 2018-02-01 09:00:00 | GOOD
B | 2018-02-01 10:00:00 | BAD
Results:
id | timestamp | status | bad_status_count
----------------------------------------------------------------
A | 2018-01-01 03:00:00 | GOOD | 0
A | 2018-01-01 04:00:00 | BAD | 1
A | 2018-01-01 05:00:00 | GOOD | 1
A | 2018-01-01 09:00:00 | BAD | 1
A | 2018-01-01 09:15:00 | BAD | 2
A | 2018-01-01 13:00:00 | GOOD | 0
A | 2018-01-01 14:00:00 | GOOD | 0
B | 2018-02-01 09:00:00 | GOOD | 0
B | 2018-02-01 10:00:00 | BAD | 1
I want to count the BAD statuses over the last 3 hours for each business (id). How can I do that? I am trying something like this:
SELECT
    id,
    timestamp,
    status,
    count(status) over (partition by id order by timestamp range between interval '3' hour preceding and current row) as bad_status_count
from table
Of course it doesn't work yet, and I still have to filter for the BAD status. I got this error:
Error running query: line 7:1: Window frame start value type must be INTEGER or BIGINT(actual interval day to second)
Answer 1:
I'm not 100% sure how to express this in PrestoDB, but the key idea is to convert the timestamps to a number of hours, so that the window frame can use a numeric offset instead of an interval:
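select t.*,
       -- count BAD rows in the frame; the sample data stores the status in upper case
       sum(case when status = 'BAD' then 1 else 0 end) over
           (partition by id
            order by hours
            range between 3 preceding and current row
           ) as bad_status
from (select t.*,
             -- whole hours elapsed since an arbitrary fixed reference point
             date_diff('hour', timestamp '2000-01-01 00:00:00', "timestamp") as hours
      from t
     ) t;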
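If the engine version in use rejects a RANGE frame with a numeric offset, one alternative is a self-join instead of a window function. The following is a minimal sketch of that swapped-in technique, not the answer's method. It assumes the table is named t as above, that each (id, timestamp) pair is unique, and that the current row counts towards its own window, which is what the expected results imply.

```sql
-- Self-join sketch: for every row a, count the BAD rows b of the same id
-- whose timestamp falls within the 3 hours up to and including a's timestamp.
-- Assumption: (id, "timestamp") is unique in t, otherwise the counts multiply.
select a.id,
       a."timestamp",
       a.status,
       count(b.id) as bad_status_count   -- unmatched left-join rows count as 0
from t a
left join t b
       on b.id = a.id
      and b.status = 'BAD'
      and b."timestamp" between a."timestamp" - interval '3' hour
                            and a."timestamp"
group by a.id, a."timestamp", a.status
order by a.id, a."timestamp";
```

Because it compares exact timestamps rather than whole-hour buckets, the same query handles minute or second lookbacks by changing the interval, for example interval '90' minute. The trade-off is cost: a self-join on a large table can be noticeably more expensive than a window function.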
Source: https://stackoverflow.com/questions/54233225/presto-sql-window-aggregate-looking-back-x-hours-minutes-seconds