Flattening intersecting timespans

温柔的废话 2020-12-14 19:56

I have lots of data with start and stop times for a given ID and I need to flatten all intersecting and adjacent timespans into one combined timespan. The sample data posted

7 Answers
  •  盖世英雄少女心
    2020-12-14 20:43

    Here is a recursive CTE solution. I took the liberty of storing a full date and time in each column rather than keeping the date in a separate column, which avoids some messy special-case code. If you must store the date separately, I would use a view or CTE to present the data as two datetime columns and then apply this approach.

    Create test data:

    --each row is one timespan: d1 = start, d2 = stop
    create table t1 (d1 datetime, d2 datetime)
    
    insert t1 (d1,d2)
        --four spans, each overlapping the next by five minutes
        select           '2009-06-03 10:00:00', '2009-06-03 14:00:00'
        union all select '2009-06-03 13:55:00', '2009-06-03 18:00:00'
        union all select '2009-06-03 17:55:00', '2009-06-03 23:00:00'
        union all select '2009-06-03 22:55:00', '2009-06-04 03:00:00'
    
        --isolated span with a five-minute gap on either side
        union all select '2009-06-04 03:05:00', '2009-06-04 07:00:00'
    
        --two spans that overlap each other
        union all select '2009-06-04 07:05:00', '2009-06-04 10:00:00'
        union all select '2009-06-04 09:55:00', '2009-06-04 14:00:00'
    

    Recursive CTE:

    ;with dateRanges (ancestorD1, parentD1, d2, iter) as
    (
    --anchor is first level of collapse
        select
            d1 as ancestorD1,
            d1 as parentD1,
            d2,
            cast(0 as int) as iter
        from t1
    
    --recurse as long as there is another range to fold in
        union all select
            tLeft.ancestorD1,
            tRight.d1 as parentD1,
            tRight.d2,
            iter + 1  as iter
        from dateRanges as tLeft join t1 as tRight
            --join condition is that the t1 row can be consumed by the recursive row
            on tLeft.d2 between tRight.d1 and tRight.d2
                --exclude identical rows
                and not (tLeft.parentD1 = tRight.d1 and tLeft.d2 = tRight.d2)
    )
    select
        ranges1.*
    from dateRanges as ranges1
    where not exists (
        select 1
        from dateRanges as ranges2
        where ranges1.ancestorD1 between ranges2.ancestorD1 and ranges2.d2
            and ranges1.d2 between ranges2.ancestorD1 and ranges2.d2
            and ranges2.iter > ranges1.iter
    )
    

    Gives output:

    ancestorD1              parentD1                d2                      iter
    ----------------------- ----------------------- ----------------------- -----------
    2009-06-04 03:05:00.000 2009-06-04 03:05:00.000 2009-06-04 07:00:00.000 0
    2009-06-04 07:05:00.000 2009-06-04 09:55:00.000 2009-06-04 14:00:00.000 1
    2009-06-03 10:00:00.000 2009-06-03 22:55:00.000 2009-06-04 03:00:00.000 3
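
As a sanity check on the result above, the same rows can be merged outside the database. This is not the recursive CTE from this answer but a different technique (sort by start time, then sweep once), sketched in Python. The names `merge_spans` and `ts` are hypothetical helpers made up for this illustration, and the sketch assumes that spans sharing an endpoint should be merged, as the inclusive `between` in the join does.

```python
from datetime import datetime


def merge_spans(spans):
    """Merge overlapping or touching (start, stop) pairs into combined spans."""
    merged = []
    for start, stop in sorted(spans):
        if merged and start <= merged[-1][1]:
            # Starts inside (or exactly at the end of) the last merged span:
            # extend that span's stop time if this one reaches further.
            merged[-1][1] = max(merged[-1][1], stop)
        else:
            # Gap before this span: start a new combined span.
            merged.append([start, stop])
    return [tuple(span) for span in merged]


def ts(text):
    """Parse a 'YYYY-MM-DD HH:MM:SS' string (hypothetical helper)."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S")


# Same rows as the t1 test data.
rows = [
    (ts("2009-06-03 10:00:00"), ts("2009-06-03 14:00:00")),
    (ts("2009-06-03 13:55:00"), ts("2009-06-03 18:00:00")),
    (ts("2009-06-03 17:55:00"), ts("2009-06-03 23:00:00")),
    (ts("2009-06-03 22:55:00"), ts("2009-06-04 03:00:00")),
    (ts("2009-06-04 03:05:00"), ts("2009-06-04 07:00:00")),
    (ts("2009-06-04 07:05:00"), ts("2009-06-04 10:00:00")),
    (ts("2009-06-04 09:55:00"), ts("2009-06-04 14:00:00")),
]

merged = merge_spans(rows)
for start, stop in merged:
    print(start, "->", stop)
```

For the sample rows this yields three spans whose start and stop times match the ancestorD1 and d2 columns of the query output.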
    
