How to Find Rows which are Duplicates by a Key but Not Duplicates in All Columns?

逝去的感伤 2020-12-30 05:31

I am working with a table which is an extract of a set of other tables. All of the rows of the extract table should be unique according to keys D1, D2 and D3. They are not.

5 answers

天命终不由人 (original poster)

2020-12-30 06:13

I haven't had a chance to try Conrad's answer yet, but came up with one of my own. It's rather a "duh" moment.

So, if you want to find all the rows in set A except for those that are in set B, you use the EXCEPT operator:

; 
WITH KEYDUPLICATES(D1,D2,D3) AS 
( 
    SELECT D1, D2, D3 
    FROM SOURCE 
    GROUP BY D1, D2, D3 
    HAVING COUNT(*)>1 
),
KEYDUPLICATEROWS AS
( 
    SELECT S.D1, S.D2, S.D3, S.C4, S.C5, S.C6 
    FROM SOURCE S 
    INNER JOIN KEYDUPLICATES D 
        ON S.D1 = D.D1 AND S.D2 = D.D2 AND S.D3 = D.D3 
),
FULLDUPLICATES AS
(
    SELECT S.D1, S.D2, S.D3, S.C4, S.C5, S.C6 
    FROM SOURCE S
    GROUP BY S.D1, S.D2, S.D3, S.C4, S.C5, S.C6 
    HAVING COUNT(*)>1
)
SELECT KR.D1, KR.D2, KR.D3, KR.C4, KR.C5, KR.C6
FROM KEYDUPLICATEROWS AS KR
EXCEPT
SELECT FD.D1, FD.D2, FD.D3, FD.C4, FD.C5, FD.C6
FROM FULLDUPLICATES AS FD
ORDER BY D1, D2, D3, C4, C5, C6

This seems to return 1500 rows which are duplicates across (D1,D2,D3) but not across all of (D1,D2,D3,C4,C5,C6). In fact, they appear to be duplicates across (D1,D2,D3,C4,C5), differing only in C6.

How to confirm that will be the subject of another question.
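For anyone who wants to try the EXCEPT approach on a small data set first, here is a minimal sketch that runs the same CTEs against an in-memory SQLite database from Python. SQLite is only a stand-in for whatever engine the real SOURCE table lives on, and the sample rows and the helper names build_sample_db and find_key_only_duplicates are made up for illustration. The only change to the query is that ORDER BY uses column positions.

```python
import sqlite3

# Same CTEs as in the answer above; ORDER BY uses column positions.
QUERY = """
WITH KEYDUPLICATES(D1, D2, D3) AS (
    SELECT D1, D2, D3
    FROM SOURCE
    GROUP BY D1, D2, D3
    HAVING COUNT(*) > 1
),
KEYDUPLICATEROWS AS (
    SELECT S.D1, S.D2, S.D3, S.C4, S.C5, S.C6
    FROM SOURCE S
    INNER JOIN KEYDUPLICATES D
        ON S.D1 = D.D1 AND S.D2 = D.D2 AND S.D3 = D.D3
),
FULLDUPLICATES AS (
    SELECT S.D1, S.D2, S.D3, S.C4, S.C5, S.C6
    FROM SOURCE S
    GROUP BY S.D1, S.D2, S.D3, S.C4, S.C5, S.C6
    HAVING COUNT(*) > 1
)
SELECT KR.D1, KR.D2, KR.D3, KR.C4, KR.C5, KR.C6
FROM KEYDUPLICATEROWS AS KR
EXCEPT
SELECT FD.D1, FD.D2, FD.D3, FD.C4, FD.C5, FD.C6
FROM FULLDUPLICATES AS FD
ORDER BY 1, 2, 3, 4, 5, 6
"""


def build_sample_db():
    """Create an in-memory SOURCE table with hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE SOURCE ("
        "D1 INTEGER, D2 INTEGER, D3 INTEGER, C4 TEXT, C5 TEXT, C6 TEXT)"
    )
    sample_rows = [
        (1, 1, 1, "a", "b", "x"),  # same key as the next row, differs in C6
        (1, 1, 1, "a", "b", "y"),
        (2, 2, 2, "a", "b", "z"),  # exact duplicate of the next row
        (2, 2, 2, "a", "b", "z"),
        (3, 3, 3, "a", "b", "q"),  # unique key, not a duplicate at all
    ]
    conn.executemany("INSERT INTO SOURCE VALUES (?, ?, ?, ?, ?, ?)", sample_rows)
    return conn


def find_key_only_duplicates(conn):
    """Return rows that share a key but are not duplicated in every column."""
    return conn.execute(QUERY).fetchall()


conn = build_sample_db()
for row in find_key_only_duplicates(conn):
    print(row)
```

With this sample data only the two (1,1,1) rows come back: the (2,2,2) pair is removed by EXCEPT because it is duplicated in every column, and (3,3,3) never enters KEYDUPLICATES. Two things to keep in mind: EXCEPT returns distinct rows, and if a key group mixes an exact-duplicate pair with a third row that differs, only the differing row shows up.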
