Question
I have a table which records every status change of an entity:
id   recordTime           Status
ID1  2014-03-01 11:33:00  Disconnected
ID1  2014-03-01 12:13:00  Connected
ID2  2014-03-01 12:21:00  Connected
ID1  2014-03-01 12:24:00  Disconnected
ID1  2014-03-01 12:29:00  Connected
ID2  2014-03-01 12:40:00  Disconnected
ID2  2014-03-01 13:03:00  Connected
ID2  2014-03-01 13:13:00  Disconnected
ID2  2014-03-01 13:29:00  Connected
ID1  2014-03-01 13:30:00  Disconnected
I need to calculate the total inactive time, i.e. the time between each 'Connected' status and the 'Disconnected' status preceding it, per ID for a given time window.
For the above table and a time range of 2014-03-01 11:00:00 to 2014-03-01 14:00:00, the output should be:
ID InactiveTime
ID1 01:15:00
ID2 02:00:00
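The expected totals can be cross-checked by hand. The following is a minimal Python sketch, not part of the original question: the helper names `ts` and `inactive_time` are hypothetical, and it assumes that statuses strictly alternate per ID and that an ID was in the opposite state before its first row inside the window.

```python
from datetime import datetime, timedelta

def ts(s):
    """Parse a 'YYYY-MM-DD HH:MM' string (hypothetical helper)."""
    return datetime.strptime(s, "%Y-%m-%d %H:%M")

# Sample rows from the question: (id, recordTime, Status)
ROWS = [
    ("ID1", ts("2014-03-01 11:33"), "Disconnected"),
    ("ID1", ts("2014-03-01 12:13"), "Connected"),
    ("ID2", ts("2014-03-01 12:21"), "Connected"),
    ("ID1", ts("2014-03-01 12:24"), "Disconnected"),
    ("ID1", ts("2014-03-01 12:29"), "Connected"),
    ("ID2", ts("2014-03-01 12:40"), "Disconnected"),
    ("ID2", ts("2014-03-01 13:03"), "Connected"),
    ("ID2", ts("2014-03-01 13:13"), "Disconnected"),
    ("ID2", ts("2014-03-01 13:29"), "Connected"),
    ("ID1", ts("2014-03-01 13:30"), "Disconnected"),
]

def inactive_time(rows, start, end):
    """Sum the disconnected time per id inside [start, end]."""
    events = {}
    for id_, t, status in rows:
        if start <= t <= end:
            events.setdefault(id_, []).append((t, status))
    totals = {}
    for id_, evs in events.items():
        evs.sort()
        total = timedelta(0)
        # Assumption: a first row 'Connected' means the id was
        # disconnected since the start of the window.
        down_since = start if evs[0][1] == "Connected" else None
        for t, status in evs:
            if status == "Disconnected":
                down_since = t
            elif down_since is not None:
                total += t - down_since
                down_since = None
        # Still disconnected at the end of the window.
        if down_since is not None:
            total += end - down_since
        totals[id_] = total
    return totals

result = inactive_time(ROWS, ts("2014-03-01 11:00"), ts("2014-03-01 14:00"))
for id_ in sorted(result):
    print(id_, result[id_])
```

For ID1 the pieces are 40 + 5 + 30 minutes, and for ID2 they are 81 + 23 + 16 minutes, which gives the 01:15:00 and 02:00:00 shown above.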
Answer 1:
The particular difficulty is not to miss the time spans that touch the edges of the time frame.
This assumes that the next row for any given id always has the opposite status.
Using the column name ts instead of recordTime:
WITH span AS (
   SELECT '2014-03-01 13:00'::timestamp AS s_from  -- start of time range
        , '2014-03-01 14:00'::timestamp AS s_to    -- end of time range
   )
, cte AS (
   SELECT id, ts, status, s_to
        , lead(ts, 1, s_from) OVER w AS span_start
        , first_value(ts)     OVER w AS last_ts
   FROM   span s
   JOIN   tbl  t ON t.ts BETWEEN s.s_from AND s.s_to
   WINDOW w AS (PARTITION BY id ORDER BY ts DESC)
   )
SELECT id, sum(time_disconnected)::text AS total_disconnected
FROM  (
   SELECT id, ts - span_start AS time_disconnected
   FROM   cte
   WHERE  status = 'Connected'
   UNION  ALL
   SELECT id, s_to - ts
   FROM   cte
   WHERE  status = 'Disconnected'
   AND    ts = last_ts
   ) sub
GROUP  BY 1
ORDER  BY 1;
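Returns intervals as requested. Note that the query uses the time range 13:00 to 14:00; set s_from to '2014-03-01 11:00' to reproduce the output given in the question.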
IDs without entries in the selected time range don't show up. You would have to query them additionally.
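To see what each branch of the query contributes, here is a rough Python mirror of its logic. It is only a sketch for checking the arithmetic, not a replacement for the SQL: `ts` and `total_disconnected` are hypothetical names, and the same alternating-status assumption applies. Every 'Connected' row contributes the time since the previous row (or since s_from), and a trailing 'Disconnected' row contributes the time up to s_to.

```python
from datetime import datetime, timedelta
from itertools import groupby

def ts(s):
    """Parse a 'YYYY-MM-DD HH:MM' string (hypothetical helper)."""
    return datetime.strptime(s, "%Y-%m-%d %H:%M")

# Sample rows from the question: (id, ts, status)
ROWS = [
    ("ID1", ts("2014-03-01 11:33"), "Disconnected"),
    ("ID1", ts("2014-03-01 12:13"), "Connected"),
    ("ID2", ts("2014-03-01 12:21"), "Connected"),
    ("ID1", ts("2014-03-01 12:24"), "Disconnected"),
    ("ID1", ts("2014-03-01 12:29"), "Connected"),
    ("ID2", ts("2014-03-01 12:40"), "Disconnected"),
    ("ID2", ts("2014-03-01 13:03"), "Connected"),
    ("ID2", ts("2014-03-01 13:13"), "Disconnected"),
    ("ID2", ts("2014-03-01 13:29"), "Connected"),
    ("ID1", ts("2014-03-01 13:30"), "Disconnected"),
]

def total_disconnected(rows, s_from, s_to):
    # JOIN ... ON t.ts BETWEEN s_from AND s_to, ordered per id by time
    in_range = sorted(
        (r for r in rows if s_from <= r[1] <= s_to),
        key=lambda r: (r[0], r[1]),
    )
    totals = {}
    for id_, grp in groupby(in_range, key=lambda r: r[0]):
        grp = list(grp)
        total = timedelta(0)
        for i, (_, t, status) in enumerate(grp):
            if status == "Connected":
                # lead(ts, 1, s_from) over descending order is the
                # previous row in time, or s_from if there is none
                span_start = grp[i - 1][1] if i > 0 else s_from
                total += t - span_start
        # Second branch: the latest row, if it is 'Disconnected'
        _, last_ts, last_status = grp[-1]
        if last_status == "Disconnected":
            total += s_to - last_ts
        totals[id_] = total
    return totals

narrow = total_disconnected(ROWS, ts("2014-03-01 13:00"), ts("2014-03-01 14:00"))
wide = total_disconnected(ROWS, ts("2014-03-01 11:00"), ts("2014-03-01 14:00"))
print(narrow)
print(wide)
```

With the range 13:00 to 14:00 this gives 30 minutes for ID1 and 19 minutes for ID2; with the question's range 11:00 to 14:00 it gives 1:15:00 and 2:00:00.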
SQL Fiddle.
Note: I cast the resulting total_disconnected to text in the fiddle, because the type interval is displayed in a terrible format.
Add IDs without entry in the selected time frame
Per request in comment.
Add to the query above (before the final ORDER BY 1):
...
UNION  ALL
SELECT id, total_disconnected
FROM  (
   SELECT DISTINCT ON (id)
          t.id, t.status, (s.s_to - s.s_from)::text AS total_disconnected
   FROM   span s
   JOIN   tbl  t ON t.ts < s.s_from  -- only from before time range
   LEFT   JOIN cte c USING (id)
   WHERE  c.id IS NULL               -- not represented in selected time frame
   ORDER  BY t.id, t.ts DESC         -- only the latest entry
   ) sub
WHERE  status = 'Disconnected'       -- only if disconnected
ORDER  BY 1;
SQL Fiddle.
Now, only IDs without entries in or before the selected time range don't show up.
Answer 2:
This is how I understand your question (SQL Fiddle):
select id, sum(diff) as inactive
from (
    select
        recordtime,
        recordTime -
            lag(recordTime, 1, recordTime)
            over(
                partition by id
                order by recordTime
            )
            as diff,
        status,
        id
    from t
) s
where status = 'Connected'
group by id
order by id
;
id | inactive
----+----------
1 | 00:45:00
2 | 00:39:00
Could you explain your desired output?
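The totals above differ from the requested output because pairing rows only with their neighbours ignores the time at the edges of the window. A small Python sketch makes the gap visible; `ts` and `interior_gaps` are hypothetical names, and the expected totals are copied from the question.

```python
from datetime import datetime, timedelta

def ts(s):
    """Parse a 'YYYY-MM-DD HH:MM' string (hypothetical helper)."""
    return datetime.strptime(s, "%Y-%m-%d %H:%M")

# Sample rows from the question: (id, recordTime, Status)
ROWS = [
    ("ID1", ts("2014-03-01 11:33"), "Disconnected"),
    ("ID1", ts("2014-03-01 12:13"), "Connected"),
    ("ID2", ts("2014-03-01 12:21"), "Connected"),
    ("ID1", ts("2014-03-01 12:24"), "Disconnected"),
    ("ID1", ts("2014-03-01 12:29"), "Connected"),
    ("ID2", ts("2014-03-01 12:40"), "Disconnected"),
    ("ID2", ts("2014-03-01 13:03"), "Connected"),
    ("ID2", ts("2014-03-01 13:13"), "Disconnected"),
    ("ID2", ts("2014-03-01 13:29"), "Connected"),
    ("ID1", ts("2014-03-01 13:30"), "Disconnected"),
]

def interior_gaps(rows):
    """Per id, sum the time from each 'Disconnected' row to the next row."""
    by_id = {}
    for id_, t, status in sorted(rows, key=lambda r: r[1]):
        by_id.setdefault(id_, []).append((t, status))
    totals = {}
    for id_, evs in by_id.items():
        total = timedelta(0)
        # Only consecutive pairs of rows; nothing before the first
        # row or after the last row is counted.
        for (t, status), (t_next, _) in zip(evs, evs[1:]):
            if status == "Disconnected":
                total += t_next - t
        totals[id_] = total
    return totals

interior = interior_gaps(ROWS)
# Expected totals for 11:00 to 14:00, taken from the question
EXPECTED = {"ID1": timedelta(minutes=75), "ID2": timedelta(minutes=120)}
missing = {id_: EXPECTED[id_] - interior[id_] for id_ in EXPECTED}
for id_ in sorted(interior):
    print(id_, interior[id_], "missing:", missing[id_])
```

The missing 30 minutes for ID1 are the span from its last 'Disconnected' row (13:30) to the end of the window, and the missing 81 minutes for ID2 are the span from the start of the window (11:00) to its first 'Connected' row (12:21).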
Answer 3:
select id, sum(diff) inactif_time
from
(
    SELECT id, "recordTime", "Status",
           LEAD("recordTime") OVER (PARTITION BY id ORDER BY "recordTime"),
           LEAD("recordTime") OVER (PARTITION BY id ORDER BY "recordTime") - "recordTime" diff
    FROM my_table
) B
where "Status" = 'Disconnected'
group by id
But it outputs:
"ID1";"00:45:00"
"ID2";"00:39:00"
Source: https://stackoverflow.com/questions/22114645/sum-of-time-difference-between-rows