I have listed duplicates using the following:
select s.MessageId, t.*
from Message s
join (
select ToUserId, FromUserId, count(*) as qty
from Message
group by ToUserId, FromUserId
having count(*) > 1
) t on s.ToUserId = t.ToUserId and s.FromUserId = t.FromUserId
Now, how do I delete all but one of the Messages in each duplicate group? (I'm trying to remove duplicates so I can apply a unique index on FromUserId and ToUserId.)
Vamsi Prabhala
Use a CTE and assign row numbers within each (ToUserId, FromUserId) pair, so that every row except the first in each duplicate group can be deleted.
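with rownums as
(select m.*,
-- Number the rows within each pair; the order by repeats the partition columns, so which row gets rnum = 1 is arbitrary
row_number() over(partition by ToUserId, FromUserId order by ToUserId, FromUserId) as rnum
from Message m)
-- Deleting through the CTE removes the underlying rows from Message
delete r
from rownums r
where rnum > 1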
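Once the delete has run, the unique index mentioned in the question can be created. The following is only a minimal sketch: the index name UX_Message_FromUserId_ToUserId is a hypothetical placeholder, and the first query simply reuses the duplicate check from the question to confirm that nothing is left over.

```sql
-- Sanity check: should return no rows once the duplicates are gone.
select ToUserId, FromUserId, count(*) as qty
from Message
group by ToUserId, FromUserId
having count(*) > 1;

-- Index name is hypothetical; use whatever fits your naming convention.
create unique index UX_Message_FromUserId_ToUserId
on Message (FromUserId, ToUserId);
```

If the check still returns rows, the create index statement will fail with a duplicate key error, so it is worth running the two statements in this order.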
--Create sample data
DECLARE @Message TABLE(ID INT ,ToUserId varchar(100),FromUserId varchar(100))
INSERT INTO @Message(ID,ToUserId, FromUserId )
VALUES ( 1,'abc', 'def' ), ( 2,'abc', 'def' ), ( 3,'abc', 'def' ), ( 4,'qaz', 'xsw' )
--Delete the duplicates, keeping the row with the lowest ID in each pair
DELETE m FROM @Message AS m
INNER JOIN (
SELECT *,row_number()OVER(PARTITION BY ToUserId,FromUserId ORDER BY ID ) AS rn FROM @Message AS m
) t ON t.ID=m.ID
WHERE t.rn>1
SELECT * FROM @Message
ID          ToUserId   FromUserId
----------- ---------- ----------
1           abc        def
4           qaz        xsw
If there is no column that uniquely identifies a row (such as ID), you can try using the row's physical locator instead (for example the undocumented %%lockres%% virtual column):
--No ID column is needed here: the delete goes directly through the derived table instead of joining back on ID
DELETE t FROM (
SELECT *,row_number()OVER(PARTITION BY ToUserId,FromUserId ORDER BY %%lockres%% ) AS rn FROM @Message
) t
WHERE t.rn>1
SELECT *, %%lockres%% FROM @Message
Source: https://stackoverflow.com/questions/41435527/deleting-duplicates-based-on-multiple-columns