I am working with a table which is an extract of a set of other tables. All of the rows of the extract table should be unique according to keys D1, D2 and D3. They are not.
I haven't had a chance to try Conrad's answer yet, but came up with one of my own. It's rather a "duh" moment.
So, if you want to find all the rows in set A except for those that are in set B, you use the EXCEPT operator:
;
WITH KEYDUPLICATES(D1,D2,D3) AS
(
SELECT D1, D2, D3
FROM SOURCE
GROUP BY D1, D2, D3
HAVING COUNT(*)>1
),
KEYDUPLICATEROWS AS
(
SELECT S.D1, S.D2, S.D3, S.C4, S.C5, S.C6
FROM SOURCE S
INNER JOIN KEYDUPLICATES D
ON S.D1 = D.D1 AND S.D2 = D.D2 AND S.D3 = D.D3
),
FULLDUPLICATES AS
(
SELECT S.D1, S.D2, S.D3, S.C4, S.C5, S.C6
FROM SOURCE S
GROUP BY S.D1, S.D2, S.D3, S.C4, S.C5, S.C6
HAVING COUNT(*)>1
)
SELECT KR.D1, KR.D2, KR.D3, KR.C4, KR.C5, KR.C6
FROM KEYDUPLICATEROWS AS KR
EXCEPT
SELECT FD.D1, FD.D2, FD.D3, FD.C4, FD.C5, FD.C6
FROM FULLDUPLICATES AS FD
ORDER BY D1, D2, D3, C4, C5, C6
This seems to be showing me 1500 rows which are duplicates across (D1,D2,D3), but which are only duplicates across a subset of (D1,D2,D3,C4,C5,C6). In fact, it appears they are duplicates across (D1,D2,D3,C4,C5).
How to confirm that will be the subject of another question.