Trouble using ROW_NUMBER() OVER (PARTITION BY …)

花落未央 2021-02-02 14:43

I'm using SQL Server 2008 R2. I have a table called EmployeeHistory with the following structure and sample data:

EmployeeID Date      DepartmentID SupervisorID
1         


        
3 answers
  •  忘掉有多难
    2021-02-02 15:22

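    This looks like a classic gaps-and-islands problem. Within a run of consecutive rows that share the same DepartmentID, the two row-number sequences rn1 and rn2 both increase by 1 per row, so the difference rn1 - rn2 stays constant. Combined with EmployeeID and DepartmentID, that difference gives the "group" number.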

    Run this query CTE-by-CTE and examine intermediate results to see how it works.
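
    As a rough illustration of that intermediate step, the sketch below runs only the row-numbering part on a trimmed-down data set. The table variable @Sample and the column alias grp are hypothetical names introduced here for illustration; the two ROW_NUMBER() expressions are the same ones used in the full queries.

    ```sql
    -- Hypothetical trimmed-down sample: one employee who moves
    -- from department 1 to department 2 and back to department 1.
    DECLARE @Sample TABLE
    (
        EmployeeID int,
        DateStarted date,
        DepartmentID int
    );

    INSERT INTO @Sample
    VALUES
    (10001,'2013-01-01',1),
    (10001,'2013-09-09',1),
    (10001,'2013-12-01',2),
    (10001,'2014-05-01',2),
    (10001,'2014-10-01',1),
    (10001,'2014-12-01',1);

    SELECT
        EmployeeID
        ,DateStarted
        ,DepartmentID
        -- Position of the row within the employee's whole history
        ,ROW_NUMBER() OVER (PARTITION BY EmployeeID ORDER BY DateStarted) AS rn1
        -- Position of the row within the employee's history in this department
        ,ROW_NUMBER() OVER (PARTITION BY EmployeeID, DepartmentID ORDER BY DateStarted) AS rn2
        -- The difference is constant inside each uninterrupted run (island)
        ,ROW_NUMBER() OVER (PARTITION BY EmployeeID ORDER BY DateStarted)
         - ROW_NUMBER() OVER (PARTITION BY EmployeeID, DepartmentID ORDER BY DateStarted) AS grp
    FROM @Sample
    ORDER BY
        EmployeeID
        ,DateStarted
    ;
    ```

    For these six rows the output should be:

    +------------+-------------+--------------+-----+-----+-----+
    | EmployeeID | DateStarted | DepartmentID | rn1 | rn2 | grp |
    +------------+-------------+--------------+-----+-----+-----+
    |      10001 | 2013-01-01  |            1 |   1 |   1 |   0 |
    |      10001 | 2013-09-09  |            1 |   2 |   2 |   0 |
    |      10001 | 2013-12-01  |            2 |   3 |   1 |   2 |
    |      10001 | 2014-05-01  |            2 |   4 |   2 |   2 |
    |      10001 | 2014-10-01  |            1 |   5 |   3 |   2 |
    |      10001 | 2014-12-01  |            1 |   6 |   4 |   2 |
    +------------+-------------+--------------+-----+-----+-----+

    Note that grp is 2 both for the department 2 rows and for the second run in department 1. The difference alone is not unique, which is why the grouping has to include DepartmentID together with rn1 - rn2.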

    Sample data

    I expanded the sample data from the question a little.

    DECLARE @Source TABLE
    (
        EmployeeID int,
        DateStarted date,
        DepartmentID int
    )
    
    INSERT INTO @Source
    VALUES
    (10001,'2013-01-01',001),
    (10001,'2013-09-09',001),
    (10001,'2013-12-01',002),
    (10001,'2014-05-01',002),
    (10001,'2014-10-01',001),
    (10001,'2014-12-01',001),
    
    (10005,'2013-05-01',001),
    (10005,'2013-11-09',001),
    (10005,'2013-12-01',002),
    (10005,'2014-10-01',001),
    (10005,'2016-12-01',001);
    

    Query for SQL Server 2008

    There is no LEAD function in SQL Server 2008, so I had to use a self-join via OUTER APPLY to fetch the DateStart of the "next" group, which becomes the DateEnd of the current one.

    WITH
    CTE
    AS
    (
        SELECT
            EmployeeID
            ,DateStarted
            ,DepartmentID
            ,ROW_NUMBER() OVER (PARTITION BY EmployeeID ORDER BY DateStarted) AS rn1
            ,ROW_NUMBER() OVER (PARTITION BY EmployeeID, DepartmentID ORDER BY DateStarted) AS rn2
        FROM @Source
    )
    ,CTE_Groups
    AS
    (
        SELECT
            EmployeeID
            ,MIN(DateStarted) AS DateStart
            ,DepartmentID
        FROM CTE
        GROUP BY
            EmployeeID
            ,DepartmentID
            ,rn1 - rn2
    )
    SELECT
        CTE_Groups.EmployeeID
        ,CTE_Groups.DepartmentID
        ,CTE_Groups.DateStart
        ,A.DateEnd
    FROM
        CTE_Groups
        OUTER APPLY
        (
            SELECT TOP(1) G2.DateStart AS DateEnd
            FROM CTE_Groups AS G2
            WHERE
                G2.EmployeeID = CTE_Groups.EmployeeID
                AND G2.DateStart > CTE_Groups.DateStart
            ORDER BY G2.DateStart
        ) AS A
    ORDER BY
        EmployeeID
        ,DateStart
    ;
    

    Query for SQL Server 2012+

    Starting with SQL Server 2012, the LEAD function is available. It replaces the OUTER APPLY self-join and makes this task more efficient.

    WITH
    CTE
    AS
    (
        SELECT
            EmployeeID
            ,DateStarted
            ,DepartmentID
            ,ROW_NUMBER() OVER (PARTITION BY EmployeeID ORDER BY DateStarted) AS rn1
            ,ROW_NUMBER() OVER (PARTITION BY EmployeeID, DepartmentID ORDER BY DateStarted) AS rn2
        FROM @Source
    )
    ,CTE_Groups
    AS
    (
        SELECT
            EmployeeID
            ,MIN(DateStarted) AS DateStart
            ,DepartmentID
        FROM CTE
        GROUP BY
            EmployeeID
            ,DepartmentID
            ,rn1 - rn2
    )
    SELECT
        CTE_Groups.EmployeeID
        ,CTE_Groups.DepartmentID
        ,CTE_Groups.DateStart
        ,LEAD(CTE_Groups.DateStart) OVER (PARTITION BY CTE_Groups.EmployeeID ORDER BY CTE_Groups.DateStart) AS DateEnd
    FROM
        CTE_Groups
    ORDER BY
        EmployeeID
        ,DateStart
    ;
    

    Result

    +------------+--------------+------------+------------+
    | EmployeeID | DepartmentID | DateStart  |  DateEnd   |
    +------------+--------------+------------+------------+
    |      10001 |            1 | 2013-01-01 | 2013-12-01 |
    |      10001 |            2 | 2013-12-01 | 2014-10-01 |
    |      10001 |            1 | 2014-10-01 | NULL       |
    |      10005 |            1 | 2013-05-01 | 2013-12-01 |
    |      10005 |            2 | 2013-12-01 | 2014-10-01 |
    |      10005 |            1 | 2014-10-01 | NULL       |
    +------------+--------------+------------+------------+
    
