How to merge time intervals in SQL Server

再見小時候 2021-01-03 01:43

Suppose I have the following event table with personId, startDate and endDate.

I want to know how much time the person X sp

7 answers
  •  我在风中等你
    2021-01-03 02:21

    Edit 1: I have modified both solutions to get correct results.

    Edit 2: I have done comparative tests using the solutions proposed by Mikael Eriksson, Conrad Frix, Philip Kelley and me. All tests use an EventTable with the following structure:

    CREATE TABLE EventTable
    (
         EventID    INT IDENTITY PRIMARY KEY
        ,PersonId   INT NOT NULL
        ,StartDate  DATETIME NOT NULL
        ,EndDate    DATETIME NOT NULL
        ,CONSTRAINT CK_StartDate_Before_EndDate CHECK(StartDate < EndDate)
    );
    

    All tests also ran with a warm buffer pool (no DBCC DROPCLEANBUFFERS) and a cold plan cache (I executed DBCC FREEPROCCACHE before every test). Because some solutions filter on PersonId = 1 and others do not, I inserted rows for only one person into EventTable (INSERT ...(PersonId,...) VALUES (1,...)).
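
    The setup described above could be sketched roughly as follows. The three sample rows are hypothetical (the data actually used in the tests is not shown), and DBCC FREEPROCCACHE clears the plan cache for the whole instance, so this is only suitable for a test server.

    ```sql
    -- Hypothetical sample rows for a single person (PersonId = 1).
    -- The first two intervals overlap; the third starts after a gap.
    INSERT  EventTable (PersonId, StartDate, EndDate)
    VALUES   (1, '20210101', '20210105')
            ,(1, '20210103', '20210108')
            ,(1, '20210110', '20210112');

    -- Cold plan cache: force the next statement to be compiled from scratch.
    -- The buffer pool is left untouched (no DBCC DROPCLEANBUFFERS), so data pages stay warm.
    DBCC FREEPROCCACHE;
    ```

    With these rows, the merged intervals are 2021-01-01 to 2021-01-08 (7 days) and 2021-01-10 to 2021-01-12 (2 days), so a correct solution should report 9 days for person 1.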

    These are the results: [image]

    My solutions use recursive CTEs.
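
    As a minimal sketch of the idea, the following walks three hypothetical rows in start-date order and shows how the running maximum end date decides whether a row joins the current interval or opens a new one. The names @Sample, Walk and GroupId are made up for this illustration; the row numbers are assigned by hand here rather than with ROW_NUMBER().

    ```sql
    -- Hypothetical sample: three intervals for one person, already numbered by start date.
    DECLARE @Sample TABLE
    (
         RowNumber  INT PRIMARY KEY
        ,StartDate  DATETIME NOT NULL
        ,EndDate    DATETIME NOT NULL
    );
    INSERT  @Sample (RowNumber, StartDate, EndDate)
    VALUES   (1, '20210101', '20210105')
            ,(2, '20210103', '20210108')  -- overlaps row 1
            ,(3, '20210110', '20210112'); -- starts after a gap

    WITH Walk
    AS
    (
        -- Anchor: the earliest interval opens group 1.
        SELECT   s.RowNumber
                ,s.StartDate
                ,s.EndDate
                ,s.EndDate AS MaxEndDate
                ,1 AS GroupId
        FROM    @Sample s
        WHERE   s.RowNumber = 1
        UNION ALL
        -- Recursive step: visit the next row, carrying the latest end date seen so far.
        SELECT   crt.RowNumber
                ,crt.StartDate
                ,crt.EndDate
                ,CASE WHEN crt.EndDate > prev.MaxEndDate THEN crt.EndDate ELSE prev.MaxEndDate END
                ,CASE WHEN crt.StartDate <= prev.MaxEndDate THEN prev.GroupId ELSE prev.GroupId + 1 END
        FROM    Walk prev
        INNER JOIN @Sample crt ON crt.RowNumber = prev.RowNumber + 1
    )
    SELECT  w.RowNumber, w.StartDate, w.EndDate, w.MaxEndDate, w.GroupId
    FROM    Walk w
    ORDER BY w.RowNumber;
    ```

    Rows 1 and 2 end up in group 1 because row 2 starts before the running maximum end date; row 3 starts after it and opens group 2.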

    Solution 1:

    WITH BaseCTE
    AS
    (
        -- Number each person's events in chronological order.
        SELECT   e.StartDate
                ,e.EndDate
                ,e.PersonId
                ,ROW_NUMBER() OVER(PARTITION BY e.PersonId ORDER BY e.StartDate, e.EndDate) RowNumber
        FROM    EventTable e
    ),  RecursiveCTE
    AS
    (
        -- Anchor: each person's first event opens interval 1.
        SELECT   b.PersonId
                ,b.RowNumber
                ,b.StartDate
                ,b.EndDate
                ,b.EndDate AS MaxEndDate
                ,1 AS PseudoDenseRank
        FROM    BaseCTE b
        WHERE   b.RowNumber = 1
        UNION ALL
        -- Recursive step: carry the latest end date seen so far; an event that
        -- starts after it opens a new interval.
        SELECT   crt.PersonId
                ,crt.RowNumber
                ,crt.StartDate
                ,crt.EndDate
                ,CASE WHEN crt.EndDate > prev.MaxEndDate THEN crt.EndDate ELSE prev.MaxEndDate END
                ,CASE WHEN crt.StartDate <= prev.MaxEndDate THEN prev.PseudoDenseRank ELSE prev.PseudoDenseRank + 1 END
        FROM    RecursiveCTE prev
        INNER JOIN BaseCTE crt ON prev.PersonId = crt.PersonId
        AND     prev.RowNumber + 1 = crt.RowNumber
    ),  SumDaysPerPersonAndInterval
    AS
    (
        -- One row per merged interval, with its length in days.
        SELECT   src.PersonId
                ,src.PseudoDenseRank --Interval ID
                ,DATEDIFF(DAY, MIN(src.StartDate), MAX(src.EndDate)) Days
        FROM    RecursiveCTE src
        GROUP BY src.PersonId, src.PseudoDenseRank
    )
    -- Total days per person across all merged intervals.
    SELECT  x.PersonId, SUM( x.Days ) DaysPerPerson
    FROM    SumDaysPerPersonAndInterval x
    GROUP BY x.PersonId
    OPTION(MAXRECURSION 32767);
    

    Solution 2:

    -- Materialize the numbered rows once, indexed by (PersonId, RowNumber),
    -- so the recursive step can seek the next row.
    DECLARE @Base TABLE --or a temporary table: CREATE TABLE #Base (...) 
    (
         PersonId   INT NOT NULL
        ,StartDate  DATETIME NOT NULL
        ,EndDate    DATETIME NOT NULL
        ,RowNumber  INT NOT NULL
        ,PRIMARY KEY(PersonId, RowNumber)
    );
    INSERT  @Base (PersonId, StartDate, EndDate, RowNumber)
    SELECT   e.PersonId
            ,e.StartDate
            ,e.EndDate
            ,ROW_NUMBER() OVER(PARTITION BY e.PersonId ORDER BY e.StartDate, e.EndDate) RowNumber
    FROM    EventTable e;
    
    WITH RecursiveCTE
    AS
    (
        -- Anchor: each person's first event opens interval 1.
        SELECT   b.PersonId
                ,b.RowNumber
                ,b.StartDate
                ,b.EndDate
                ,b.EndDate AS MaxEndDate
                ,1 AS PseudoDenseRank
        FROM    @Base b
        WHERE   b.RowNumber = 1
        UNION ALL
        -- Recursive step: carry the latest end date seen so far; an event that
        -- starts after it opens a new interval.
        SELECT   crt.PersonId
                ,crt.RowNumber
                ,crt.StartDate
                ,crt.EndDate
                ,CASE WHEN crt.EndDate > prev.MaxEndDate THEN crt.EndDate ELSE prev.MaxEndDate END
                ,CASE WHEN crt.StartDate <= prev.MaxEndDate THEN prev.PseudoDenseRank ELSE prev.PseudoDenseRank + 1 END
        FROM    RecursiveCTE prev
        INNER JOIN @Base crt ON prev.PersonId = crt.PersonId
        AND     prev.RowNumber + 1 = crt.RowNumber
    ),  SumDaysPerPersonAndInterval
    AS
    (
        -- One row per merged interval, with its length in days.
        SELECT   src.PersonId
                ,src.PseudoDenseRank --Interval ID
                ,DATEDIFF(DAY, MIN(src.StartDate), MAX(src.EndDate)) Days
        FROM    RecursiveCTE src
        GROUP BY src.PersonId, src.PseudoDenseRank
    )
    -- Total days per person across all merged intervals.
    SELECT  x.PersonId, SUM( x.Days ) DaysPerPerson
    FROM    SumDaysPerPersonAndInterval x
    GROUP BY x.PersonId
    OPTION(MAXRECURSION 32767);
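
    The comment on the DECLARE line mentions a temporary table as an alternative to the table variable. A rough sketch of that variant, assuming the EventTable defined above, might look like this; it only covers the materialization step, and the recursive query itself would be the one from Solution 2 with #Base substituted for @Base.

    ```sql
    -- Same shape as @Base, but as a temporary table.
    CREATE TABLE #Base
    (
         PersonId   INT NOT NULL
        ,StartDate  DATETIME NOT NULL
        ,EndDate    DATETIME NOT NULL
        ,RowNumber  INT NOT NULL
        ,PRIMARY KEY(PersonId, RowNumber)
    );

    -- Number each person's events in chronological order.
    INSERT  #Base (PersonId, StartDate, EndDate, RowNumber)
    SELECT   e.PersonId
            ,e.StartDate
            ,e.EndDate
            ,ROW_NUMBER() OVER(PARTITION BY e.PersonId ORDER BY e.StartDate, e.EndDate)
    FROM    EventTable e;

    -- Run the Solution 2 query here, reading from #Base instead of @Base.

    -- Clean up once the result has been read.
    DROP TABLE #Base;
    ```

    Unlike a table variable, a temporary table has column statistics, so the optimizer may choose a different plan; whether that helps here would need to be measured.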
    
