Deleting duplicates based on multiple columns

时光怂恿深爱的人放手 submitted on 2019-12-06 03:05:48

Question


I have listed duplicates using the following:

select s.MessageId, t.* 
from Message s
join (
    select ToUserId, FromUserId, count(*) as qty
    from Message
    group by ToUserId, FromUserId
    having count(*) > 1
) t on s.ToUserId = t.ToUserId and s.FromUserId = t.FromUserId

Now, how do I delete all but one of the Messages in each duplicate group? I'm trying to remove duplicates so that I can apply a unique index on FromUserId and ToUserId.


Answer 1:


Use a CTE to assign row numbers within each (ToUserId, FromUserId) pair, then delete every row except the first one in each pair.

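-- Number the rows within each (ToUserId, FromUserId) pair.
-- Ordering by the partition columns themselves leaves the choice of
-- surviving row arbitrary; order by another column to control it.
with rownums as 
(select m.*, 
 row_number() over(partition by ToUserId, FromUserId order by ToUserId, FromUserId) as rnum
 from Message m)
-- Deleting through the CTE removes the underlying rows from Message.
delete r
from rownums r
where rnum > 1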
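
Since the stated goal is a unique index, one way to finish the job is to run the delete inside a transaction, check that no duplicate pairs remain, and only then create the index. The following is a minimal sketch and not part of the original answer. The index name UX_Message_FromUserId_ToUserId is hypothetical, and ordering by MessageId (the column from the question's query) is an assumption that keeps the row with the lowest MessageId in each pair.

```sql
BEGIN TRANSACTION;

-- Same CTE technique as above, but ordered by MessageId (assumed unique)
-- so that the surviving row in each pair is predictable.
WITH rownums AS (
    SELECT row_number() OVER (PARTITION BY ToUserId, FromUserId
                              ORDER BY MessageId) AS rnum
    FROM Message
)
DELETE FROM rownums
WHERE rnum > 1;

-- Re-run the duplicate check from the question; it should return no rows.
SELECT ToUserId, FromUserId, COUNT(*) AS qty
FROM Message
GROUP BY ToUserId, FromUserId
HAVING COUNT(*) > 1;

-- If the check above returned rows, run ROLLBACK TRANSACTION instead.
COMMIT TRANSACTION;

-- Hypothetical index name; the columns are the ones named in the question.
CREATE UNIQUE INDEX UX_Message_FromUserId_ToUserId
    ON Message (FromUserId, ToUserId);
```

When run interactively, the statements can be executed one at a time so that the result of the check is inspected before committing.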



Answer 2:


Create some sample data:

    DECLARE @Message TABLE(ID INT ,ToUserId varchar(100),FromUserId varchar(100))
    INSERT INTO @Message(ID,ToUserId, FromUserId )
    VALUES  ( 1,'abc',  'def'  ), ( 2,'abc',  'def'  ), ( 3,'abc',  'def'  ), ( 4,'qaz',  'xsw'  )

Delete the duplicates, keeping the row with the lowest ID in each (ToUserId, FromUserId) pair:

    DELETE m FROM @Message AS m 
    INNER JOIN (
         SELECT *,row_number()OVER(PARTITION BY ToUserId,FromUserId ORDER BY ID ) AS rn FROM @Message AS m
    ) t ON t.ID=m.ID
    WHERE t.rn>1

    SELECT * FROM @Message
Result:

    ID          ToUserId   FromUserId
    ----------- ---------- ----------
    1           abc        def
    4           qaz        xsw
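
For comparison, the same result can be reached without a window function by keeping the minimum ID of each pair. This is a sketch of a different technique from the one in this answer, and it assumes that ID is unique and never NULL; it repeats the sample data so that it can be run as a single batch.

```sql
-- Sample data copied from the answer above.
DECLARE @Message TABLE (ID INT, ToUserId varchar(100), FromUserId varchar(100));
INSERT INTO @Message (ID, ToUserId, FromUserId)
VALUES (1, 'abc', 'def'), (2, 'abc', 'def'), (3, 'abc', 'def'), (4, 'qaz', 'xsw');

-- Keep only the row with the smallest ID in each (ToUserId, FromUserId) pair.
-- Assumes ID is unique and non-NULL; NOT IN misbehaves if the subquery yields NULL.
DELETE FROM @Message
WHERE ID NOT IN (
    SELECT MIN(ID)
    FROM @Message
    GROUP BY ToUserId, FromUserId
);

SELECT * FROM @Message;  -- rows with ID 1 and ID 4 remain
```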

If the table has no column that identifies a row uniquely (such as ID), you can try ordering by the row's lock resource instead, for example the undocumented %%lockres%% virtual column:

    -- Delete through a CTE so that no ID column is needed to match rows
    ;WITH t AS (
        SELECT row_number()OVER(PARTITION BY ToUserId,FromUserId ORDER BY %%lockres%% ) AS rn FROM @Message
    )
    DELETE FROM t
    WHERE rn>1

    SELECT *, %%lockres%% FROM @Message


Source: https://stackoverflow.com/questions/41435527/deleting-duplicates-based-on-multiple-columns
