Question
I need to number rows with ordering, partitioning and grouping: order by IdDocument, DateChange; partition by IdDocument; group by IdRole. The grouping is the hard part. As the example shows (NumberingExpected), DENSE_RANK() looks like the best fit, but it repeats a number only when the values used for ordering are the same. In my case the ordering values (IdDocument, DateChange) are always different, and the number must repeat for as long as IdRole stays the same in consecutive rows.
This could easily be solved with a cursor. But is there a way to do it with the numbering/ranking functions?
Test data:
declare @LogTest as table (
     Id INT
    ,IdRole INT
    ,DateChange DATETIME
    ,IdDocument INT
    ,NumberingExpected INT
)
insert into @LogTest
select 1 as Id, 7 as IdRole, GETDATE() as DateChange, 13 as IdDocument, 1 as NumberingExpected
union
select 2, 3, DATEADD(HH, 1, GETDATE()), 13, 2
union
select 3, 3, DATEADD(HH, 2, GETDATE()), 13, 2
union
select 4, 3, DATEADD(HH, 3, GETDATE()), 13, 2
union
select 5, 5, DATEADD(HH, 4, GETDATE()), 13, 3
union
select 7, 3, DATEADD(HH, 6, GETDATE()), 13, 4
union
select 6, 3, DATEADD(HH, 5, GETDATE()), 27, 1
union
select 8, 3, DATEADD(HH, 7, GETDATE()), 27, 1
union
select 9, 5, DATEADD(HH, 8, GETDATE()), 27, 2
union
select 10, 3, DATEADD(HH, 9, GETDATE()), 27, 3;

select * from @LogTest order by IdDocument, DateChange;
Explanation in procedural terms:
1) Order the data by IdDocument, DateChange.
2) Set the first row's number to i = 1 and go to the next row.
3) If IdDocument has changed { i = 1; } else { if IdRole has changed { i++; } }
4) Set the row number to i.
5) Go to the next row.
6) If EOF { exit; } else { go to step 3; }
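The loop described above can be sketched outside SQL to make the expected numbering concrete. This is a minimal Python sketch, not T-SQL: the function name `number_rows`, the dictionary layout, and the fixed base timestamp standing in for GETDATE() are assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Assumption: a fixed base time replaces GETDATE() so the sketch is repeatable.
BASE = datetime(2015, 1, 26, 20, 0, 0)

# (Id, IdRole, hour offset, IdDocument), mirroring the test data.
SAMPLE = [
    (1, 7, 0, 13), (2, 3, 1, 13), (3, 3, 2, 13), (4, 3, 3, 13), (5, 5, 4, 13),
    (7, 3, 6, 13), (6, 3, 5, 27), (8, 3, 7, 27), (9, 5, 8, 27), (10, 3, 9, 27),
]
ROWS = [
    {"Id": i, "IdRole": r, "DateChange": BASE + timedelta(hours=h), "IdDocument": d}
    for i, r, h, d in SAMPLE
]


def number_rows(rows):
    """Return (Id, number) pairs in IdDocument, DateChange order."""
    ordered = sorted(rows, key=lambda r: (r["IdDocument"], r["DateChange"]))
    numbered = []
    prev = None
    i = 0
    for row in ordered:
        if prev is None or row["IdDocument"] != prev["IdDocument"]:
            i = 1            # new document: restart numbering
        elif row["IdRole"] != prev["IdRole"]:
            i += 1           # same document, role changed: next number
        numbered.append((row["Id"], i))
        prev = row
    return numbered


print(number_rows(ROWS))
```

The numbers produced for the sample rows match the NumberingExpected column of the test data.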
Answer 1:
SQL Server 2012 and later provide LAG/LEAD, but they are not available in SQL Server 2008, so we'll emulate them. Performance could be poor, so check it against your actual data.
This is the final query:
WITH
CTE_rn
AS
(
    SELECT
        Main.IdRole
        ,Main.IdDocument
        ,Main.DateChange
        ,ROW_NUMBER() OVER(PARTITION BY Main.IdDocument ORDER BY Main.DateChange) AS rn
    FROM
        @LogTest AS Main
        OUTER APPLY
        (
            SELECT TOP (1) T.IdRole
            FROM @LogTest AS T
            WHERE
                T.IdDocument = Main.IdDocument
                AND T.DateChange < Main.DateChange
            ORDER BY T.DateChange DESC
        ) AS Prev
    WHERE Main.IdRole <> Prev.IdRole OR Prev.IdRole IS NULL
)
SELECT *
FROM
    @LogTest AS LT
    CROSS APPLY
    (
        SELECT TOP(1) CTE_rn.rn
        FROM CTE_rn
        WHERE
            CTE_rn.IdDocument = LT.IdDocument
            AND CTE_rn.IdRole = LT.IdRole
            AND CTE_rn.DateChange <= LT.DateChange
        ORDER BY CTE_rn.DateChange DESC
    ) CA_rn
ORDER BY IdDocument, DateChange;
Final result set:
Id  IdRole  DateChange               IdDocument  NumberingExpected  rn
1   7       2015-01-26 20:00:41.210  13          1                  1
2   3       2015-01-26 21:00:41.210  13          2                  2
3   3       2015-01-26 22:00:41.210  13          2                  2
4   3       2015-01-26 23:00:41.210  13          2                  2
5   5       2015-01-27 00:00:41.210  13          3                  3
7   3       2015-01-27 02:00:41.210  13          4                  4
6   3       2015-01-27 01:00:41.210  27          1                  1
8   3       2015-01-27 03:00:41.210  27          1                  1
9   5       2015-01-27 04:00:41.210  27          2                  2
10  3       2015-01-27 05:00:41.210  27          3                  3
How it works
1) We need the value of IdRole from the previous row when the table is ordered by IdDocument and DateChange. To get it we use OUTER APPLY (because LAG is not available):
SELECT *
FROM
    @LogTest AS Main
    OUTER APPLY
    (
        SELECT TOP (1) T.IdRole
        FROM @LogTest AS T
        WHERE
            T.IdDocument = Main.IdDocument
            AND T.DateChange < Main.DateChange
        ORDER BY T.DateChange DESC
    ) AS Prev
ORDER BY Main.IdDocument, Main.DateChange;
This is the result set of the first step (the last column is Prev.IdRole, the IdRole of the previous row):
Id  IdRole  DateChange               IdDocument  NumberingExpected  IdRole
1   7       2015-01-26 20:50:32.560  13          1                  NULL
2   3       2015-01-26 21:50:32.560  13          2                  7
3   3       2015-01-26 22:50:32.560  13          2                  3
4   3       2015-01-26 23:50:32.560  13          2                  3
5   5       2015-01-27 00:50:32.560  13          3                  3
7   3       2015-01-27 02:50:32.560  13          4                  5
6   3       2015-01-27 01:50:32.560  27          1                  NULL
8   3       2015-01-27 03:50:32.560  27          1                  3
9   5       2015-01-27 04:50:32.560  27          2                  3
10  3       2015-01-27 05:50:32.560  27          3                  5
2) We want to keep only the rows where IdRole differs from the previous row (or where there is no previous row), so we add a WHERE clause and number the remaining rows. You can see that the row numbers follow the expected result:
SELECT
    Main.IdRole
    ,Main.IdDocument
    ,Main.DateChange
    ,ROW_NUMBER() OVER(PARTITION BY Main.IdDocument ORDER BY Main.DateChange) AS rn
FROM
    @LogTest AS Main
    OUTER APPLY
    (
        SELECT TOP (1) T.IdRole
        FROM @LogTest AS T
        WHERE
            T.IdDocument = Main.IdDocument
            AND T.DateChange < Main.DateChange
        ORDER BY T.DateChange DESC
    ) AS Prev
WHERE Main.IdRole <> Prev.IdRole OR Prev.IdRole IS NULL
;
This is the result set of this step (it becomes the CTE):
IdRole  IdDocument  DateChange               rn
7       13          2015-01-26 20:13:26.247  1
3       13          2015-01-26 21:13:26.247  2
5       13          2015-01-27 00:13:26.247  3
3       13          2015-01-27 02:13:26.247  4
3       27          2015-01-27 01:13:26.247  1
5       27          2015-01-27 04:13:26.247  2
3       27          2015-01-27 05:13:26.247  3
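3) Finally, we need to get the correct row number from the CTE for each row of the original table. I use CROSS APPLY to pick one CTE row for each original row: the most recent CTE row with the same IdDocument and IdRole whose DateChange is not later than the original row's.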
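The three steps can also be traced in ordinary code, which may help when checking the query against other data. The following is a rough Python sketch of the same logic, not a translation the answer itself provides; the helper names (`prev_role`, `island_starts`, `lookup_rn`) are hypothetical, and DateChange is represented as a plain hour offset as a simplifying assumption.

```python
# (Id, IdRole, hour offset standing in for DateChange, IdDocument), mirroring @LogTest.
SAMPLE = [
    (1, 7, 0, 13), (2, 3, 1, 13), (3, 3, 2, 13), (4, 3, 3, 13), (5, 5, 4, 13),
    (7, 3, 6, 13), (6, 3, 5, 27), (8, 3, 7, 27), (9, 5, 8, 27), (10, 3, 9, 27),
]
ROWS = [dict(Id=i, IdRole=r, DateChange=h, IdDocument=d) for i, r, h, d in SAMPLE]


def prev_role(rows, main):
    """Step 1: IdRole of the latest earlier row in the same document, or None."""
    earlier = [t for t in rows
               if t["IdDocument"] == main["IdDocument"]
               and t["DateChange"] < main["DateChange"]]
    if not earlier:
        return None
    return max(earlier, key=lambda t: t["DateChange"])["IdRole"]


def island_starts(rows):
    """Step 2: keep rows whose role differs from the previous row; number them per document."""
    starts = [r for r in rows if prev_role(rows, r) != r["IdRole"]]
    starts.sort(key=lambda r: (r["IdDocument"], r["DateChange"]))
    counters = {}
    numbered = []
    for s in starts:
        doc = s["IdDocument"]
        counters[doc] = counters.get(doc, 0) + 1
        numbered.append({**s, "rn": counters[doc]})
    return numbered


def lookup_rn(rows):
    """Step 3: for each row, take rn from the latest matching start row."""
    starts = island_starts(rows)
    result = {}
    for r in rows:
        candidates = [s for s in starts
                      if s["IdDocument"] == r["IdDocument"]
                      and s["IdRole"] == r["IdRole"]
                      and s["DateChange"] <= r["DateChange"]]
        result[r["Id"]] = max(candidates, key=lambda s: s["DateChange"])["rn"]
    return result


print(lookup_rn(ROWS))
```

Like the SQL version, this rescans the rows for every lookup, so it illustrates the logic rather than an efficient implementation.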
Answer 2:
This might not be pretty but it does create the required output.
;with cte as (
    select l.Id, l.IdRole, l.IdDocument, l.NumberingExpected, l.DateChange,
        (select min(x.DateChange)
         from @LogTest x
         where x.IdDocument = l.IdDocument and x.IdRole = l.IdRole and x.Id <= l.Id
           and x.Id > (select max(y.Id)
                       from @LogTest y
                       where y.IdDocument = l.IdDocument and y.IdRole <> l.IdRole and y.Id <= l.Id)
        ) as DateChange2
    from @LogTest l
)
select c.Id, c.IdRole, c.DateChange, c.IdDocument, c.NumberingExpected,
    dense_rank() over (partition by c.IdDocument order by c.DateChange2) as rn
from cte c
order by c.IdDocument, c.DateChange;
With more time, I think the x.Id predicate in the CTE could be improved.
Answer 3:
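-- Requires SQL Server 2012 or later (LAG and ORDER BY inside SUM() OVER).
WITH RankByIdDocumentAndDataChanged AS
(
    SELECT *,
        -- DIFF is 0 when IdRole equals the previous row's IdRole, otherwise 1
        -- (including the first row of each document, where LAG returns NULL).
        CASE
            IdRole - LAG(IdRole) OVER (PARTITION BY IdDocument ORDER BY DateChange)
            WHEN 0 THEN 0
            ELSE 1
        END AS DIFF
    FROM @LogTest
)
-- The running total of DIFF within each document is the required number.
SELECT *, SUM(DIFF) OVER (PARTITION BY IdDocument ORDER BY DateChange) AS rn
FROM RankByIdDocumentAndDataChanged
ORDER BY Id;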
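The query marks each row with 1 when its IdRole differs from the previous row in the same document (or when there is no previous row) and 0 otherwise, then takes a running total of those marks per document. A small Python sketch of that idea follows; the function name `lag_running_sum` and the tuple layout with an hour offset in place of DateChange are assumptions for illustration only.

```python
from itertools import accumulate, groupby

# (Id, IdRole, hour offset standing in for DateChange, IdDocument), mirroring @LogTest.
SAMPLE = [
    (1, 7, 0, 13), (2, 3, 1, 13), (3, 3, 2, 13), (4, 3, 3, 13), (5, 5, 4, 13),
    (7, 3, 6, 13), (6, 3, 5, 27), (8, 3, 7, 27), (9, 5, 8, 27), (10, 3, 9, 27),
]


def lag_running_sum(rows):
    """Map each Id to the running count of role changes within its document."""
    ordered = sorted(rows, key=lambda r: (r[3], r[2]))
    result = {}
    for _, group in groupby(ordered, key=lambda r: r[3]):   # PARTITION BY IdDocument
        group = list(group)
        roles = [r[1] for r in group]
        # First row has no predecessor, so it always counts as a change.
        diff = [1] + [0 if cur == prev else 1 for prev, cur in zip(roles, roles[1:])]
        for row, rn in zip(group, accumulate(diff)):         # running SUM(DIFF)
            result[row[0]] = rn
    return result


print(lag_running_sum(SAMPLE))
```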
Source: https://stackoverflow.com/questions/28146158/how-to-make-row-numbering-with-ordering-partitioning-and-grouping