Sum duration of overlapping periods with priority by excluding the overlap itself

攒了一身酷 2021-01-22 19:16

I have an R script that I am trying to rewrite in PostgreSQL to feed a Grafana dashboard. I have the basics, so I am almost done with the other parts of the script, but what I

2 answers
  •  予麋鹿 (original poster)
    2021-01-22 19:48

    This is a type of gaps-and-islands problem. To solve this, find where the "islands" begin and then aggregate. So, to get the islands:

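    -- Outer query: collapse each island to its first start and last end.
    select a.name, min(start) as startt, max("end") as endt
    from (select a.*,
                 -- Running count of island starts; rows in the same island share a grp.
                 count(*) filter (where prev_end is null or prev_end < start) over (partition by name order by start, id) as grp
          from (select a.*,
                       -- Latest "end" among all preceding rows for this name.
                       max("end") over (partition by name
                                        order by start, id
                                        rows between unbounded preceding and 1 preceding
                                       ) as prev_end
                from activities a
               ) a
         ) a
    group by name, grp;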
    

    The next step is just to aggregate again:

    with islands as (
          select a.name, min(start) as startt, max("end") as endt
          from (select a.*,
                       count(*) filter (where prev_end is null or prev_end < start) over (partition by name order by start, id) as grp
                from (select a.*,
                             max("end") over (partition by name
                                              order by start, id
                                              rows between unbounded preceding and 1 preceding
                                             ) as prev_end
                      from activities a
                     ) a
               ) a
          group by name, grp
         )
    select name, sum(endt - startt)
    from islands i
    group by name;
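
    To sanity-check the logic outside the database, the two steps can be sketched in Python. This is only an illustration, not part of the query: the sample rows, the numeric times, and the helper names `islands` and `total_duration` are assumptions made up for the example.

```python
from collections import defaultdict

# Hypothetical sample rows: (id, name, start, end), with times as plain numbers.
rows = [
    (1, "a", 1, 5),
    (2, "a", 3, 8),    # overlaps row 1
    (3, "a", 10, 12),  # starts after every earlier period has ended
    (4, "b", 2, 4),
]

def islands(rows):
    """Merge overlapping periods per name, mirroring the query's prev_end test."""
    by_name = defaultdict(list)
    for id_, name, start, end in rows:
        by_name[name].append((start, id_, end))
    result = defaultdict(list)
    for name, periods in by_name.items():
        prev_end = None  # running maximum of "end" over all preceding rows
        for start, _id, end in sorted(periods):  # order by start, id
            if prev_end is None or prev_end < start:
                result[name].append([start, end])  # a new island begins
            else:
                # Same island: extend its end if this period reaches further.
                result[name][-1][1] = max(result[name][-1][1], end)
            prev_end = end if prev_end is None else max(prev_end, end)
    return dict(result)

def total_duration(rows):
    """Sum (end - start) over the merged islands of each name."""
    return {name: sum(e - s for s, e in isl) for name, isl in islands(rows).items()}

merged = islands(rows)
totals = total_duration(rows)
```

    With these rows, name "a" collapses to two islands and the overlap between rows 1 and 2 is counted only once.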
    

    Here is a db<>fiddle.

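    Note that this uses a cumulative trailing maximum to define the overlaps, which is the most general method for determining them. I think this will work on all edge cases, including one long period that fully contains later, separate periods (each digit marks the start or end of that period):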

    1----------2---2----3--3-----1
    

    It also handles ties on the start time.
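
    To see why the trailing maximum matters for that layout, here is a small Python sketch comparing it with a check against the previous row's end only. The times and the helper `count_islands` are hypothetical, chosen just to reproduce the diagram.

```python
def count_islands(periods, trailing_max=True):
    """Count islands among (start, end) pairs, processed in start order.

    trailing_max=True compares each start with the largest end seen so far
    (the approach the query takes); False compares it with the previous
    row's end only.
    """
    count = 0
    prev_end = None
    for start, end in sorted(periods):
        if prev_end is None or prev_end < start:
            count += 1  # nothing seen so far covers this start
        if trailing_max and prev_end is not None:
            prev_end = max(prev_end, end)
        else:
            prev_end = end
    return count

# Hypothetical times for the layout 1----------2---2----3--3-----1
nested = [(0, 30), (11, 15), (20, 23)]

with_max = count_islands(nested, trailing_max=True)
with_lag = count_islands(nested, trailing_max=False)
```

    The previous-row check splits period 3 into its own island because period 2 has already ended, even though period 1 still covers it; the trailing maximum keeps all three together.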
