SQL Server Weird Grouping Scenario by multiple columns and OR

Submitted by 冷暖自知 on 2019-12-21 09:18:07

Question


I have a weird grouping scenario and am having trouble working out the best way to do the grouping in SQL.

Imagine we have the following table:

CREATE TABLE Item
(
  KeyId VARCHAR(1) NOT NULL,
  Col1 INT NULL,
  Col2 INT NULL,
  Col3 INT NULL
)

GO

INSERT INTO Item (KeyId, Col1, Col2, Col3)
VALUES 
('a',1,2,3),
('b',5,4,3),
('c',5,7,6),
('d',8,7,9),
('e',11,10,9),
('f',11,12,13),
('g',20,22,21),
('h',23,22,24)

I need to group the records in this table so that if Col1 OR Col2 OR Col3 is the same for two records, those two records end up in the same group, and the grouping must chain. In other words, with the data above, record 'a' (the first record) has Col3 = 3 and record 'b' (the second record) also has Col3 = 3, so these two should be in one group. But record 'b' has the same Col1 as record 'c', so record 'c' should be in the same group as 'a' and 'b'. Record 'd' then has the same Col2 as 'c', so it also belongs to that group. Similarly, 'e' shares Col3 = 9 with 'd', and 'f' shares Col1 = 11 with 'e', so both join the same group.

On the other hand, records 'g' and 'h' will be in one group (because they both have Col2 = 22), but this group is different from the group containing records 'a','b','c','d','e','f'.

The result of the query should be something like:

KeyId GroupId
'a'   1 
'b'   1
'c'   1
'd'   1
'e'   1
'f'   1
'g'   2
'h'   2

There is probably a way of doing this with loops or cursors, but I started thinking about a cleaner way, and that seems quite difficult.
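
For reference, the loop-based approach alluded to above could look roughly like the sketch below. It is not part of the original post: the temp table #Grp, its column GroupKey, and the variable @changed are hypothetical names chosen for illustration. The idea is to label every row with its own key and then repeatedly lower each label to the smallest label found among rows sharing Col1, Col2 or Col3, until nothing changes.

```sql
-- Start with every row as its own group, labelled by its own key.
SELECT KeyId, KeyId AS GroupKey
INTO   #Grp                      -- hypothetical temp table name
FROM   Item;

DECLARE @changed INT = 1;

WHILE @changed > 0
BEGIN
  -- Lower each row's label to the smallest label among the rows
  -- that share Col1, Col2 or Col3 with it (NULLs never match).
  UPDATE g
  SET    GroupKey = n.MinKey
  FROM   #Grp AS g
  JOIN  (SELECT i1.KeyId, MIN(g2.GroupKey) AS MinKey
         FROM   Item AS i1
         JOIN   Item AS i2
                ON i2.Col1 = i1.Col1
                OR i2.Col2 = i1.Col2
                OR i2.Col3 = i1.Col3
         JOIN   #Grp AS g2 ON g2.KeyId = i2.KeyId
         GROUP BY i1.KeyId) AS n
         ON n.KeyId = g.KeyId
  WHERE  n.MinKey < g.GroupKey;

  -- Stop once a full pass changes no labels.
  SET @changed = @@ROWCOUNT;
END;

-- Turn the surviving labels into consecutive group numbers.
SELECT KeyId,
       DENSE_RANK() OVER (ORDER BY GroupKey) AS GroupId
FROM   #Grp
ORDER BY KeyId;

DROP TABLE #Grp;
```

Because each pass only compares a row with its direct neighbours, the number of passes grows with the length of the longest chain, so this is a sketch of the idea rather than a tuned solution.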


Answer 1:


Here you go:

with g (rootid, previd, level, keyid, col1, col2, col3) as (
  select keyid, '-', 1, keyid, col1, col2, col3 from item
  union all
  select g.rootid, g.keyid, g.level + 1, i.keyid, i.col1, i.col2, i.col3 
    from g
    join item i on i.col1 = g.col1 or i.col2 = g.col2 or i.col3 = g.col3 
    where i.keyid > g.keyid
),
  m (keyid, rootid) as (
  select keyid, min(rootid) from g group by keyid
)
select * from m;

Result:

keyid  rootid  
-----  ------
a      a       
b      a       
c      a       
d      a       
e      a       
f      a       
g      g       
h      g       
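
The question asks for a numeric GroupId rather than a root key. One possible way to get it, offered here as a sketch and not as part of the original answer, is to rank the distinct rootid values with DENSE_RANK() in the final select. The level and previd columns are dropped because they are not needed for the result.

```sql
WITH g (rootid, keyid, col1, col2, col3) AS (
  -- Anchor: every row starts as the root of its own chain.
  SELECT KeyId, KeyId, Col1, Col2, Col3 FROM Item
  UNION ALL
  -- Recursive step: follow any shared column value to a row with a larger key.
  SELECT g.rootid, i.KeyId, i.Col1, i.Col2, i.Col3
  FROM   g
  JOIN   Item i ON i.Col1 = g.col1 OR i.Col2 = g.col2 OR i.Col3 = g.col3
  WHERE  i.KeyId > g.keyid
),
m (keyid, rootid) AS (
  -- The smallest root that reaches a row identifies its group.
  SELECT keyid, MIN(rootid) FROM g GROUP BY keyid
)
SELECT keyid AS KeyId,
       DENSE_RANK() OVER (ORDER BY rootid) AS GroupId  -- a -> 1, g -> 2
FROM   m
ORDER BY keyid;
```

One caveat about the recursive step: because of the condition i.keyid > g.keyid, links are only followed toward larger keys. That is enough for the sample data, but if two rows are connected only through a third row whose key is larger than both, the row with the larger of the two keys can end up with a different rootid than the rest of its group.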

Note: Keep in mind that SQL Server by default limits recursive CTEs to 100 levels of recursion. In this query each level extends a chain by one row, so a group containing a chain of more than about 100 linked rows would exceed the limit. In other words, even though it's possible to do this as shown above, there are clear limits to what SQL Server will process by default. If you reach this limit you'll get the message:

The maximum recursion 100 has been exhausted before statement completion.

If this happens, consider adding the clause option (maxrecursion 32767) to the end of the query; 32767 is the largest limit that can be specified.
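
As a sketch of where the hint goes: in T-SQL the OPTION clause is attached to the end of the statement that uses the CTE, not to the CTE definition itself. The query below is the one from this answer with only that clause added.

```sql
WITH g (rootid, previd, level, keyid, col1, col2, col3) AS (
  SELECT keyid, '-', 1, keyid, col1, col2, col3 FROM item
  UNION ALL
  SELECT g.rootid, g.keyid, g.level + 1, i.keyid, i.col1, i.col2, i.col3
  FROM   g
  JOIN   item i ON i.col1 = g.col1 OR i.col2 = g.col2 OR i.col3 = g.col3
  WHERE  i.keyid > g.keyid
),
m (keyid, rootid) AS (
  SELECT keyid, MIN(rootid) FROM g GROUP BY keyid
)
SELECT * FROM m
OPTION (MAXRECURSION 32767);  -- raise the default limit of 100 levels
```

MAXRECURSION accepts values from 0 to 32767, and 0 removes the limit entirely, so it is worth being sure the recursion terminates before using it. Here it does, because each step must move to a strictly larger keyid.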



Source: https://stackoverflow.com/questions/51632251/sql-server-weird-grouping-scenario-by-multiple-columns-and-or
