SQL Server 2008: delete duplicate rows

北荒 2020-12-13 11:29

I have duplicate rows in my table. How can I delete them based on a single column's value?

E.g.:

uniqueid, col2, col3 ...
1, john, simpson
2, sally, ro         


        
6 answers
  •  情歌与酒
    2020-12-13 11:54

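    You can DELETE from a CTE:

    -- Number the rows within each uniqueid group, then delete all but the first.
    -- [Table] is a placeholder; the brackets are needed because TABLE is a reserved word.
    WITH cte AS (SELECT *, ROW_NUMBER() OVER (PARTITION BY uniqueid ORDER BY col2) AS RowRank
                 FROM [Table])
    DELETE FROM cte
    WHERE RowRank > 1;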
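
    If you want to check the effect before making it permanent, the same statement can be run inside a transaction and rolled back. This is only a sketch, assuming [Table] stands in for your real table name (bracketed because TABLE is a reserved word):

    BEGIN TRANSACTION;

    WITH cte AS (SELECT *, ROW_NUMBER() OVER (PARTITION BY uniqueid ORDER BY col2) AS RowRank
                 FROM [Table])
    DELETE FROM cte
    WHERE RowRank > 1;

    -- Inspect what is left while the transaction is still open.
    SELECT * FROM [Table] ORDER BY uniqueid;

    -- Undo the delete; switch to COMMIT TRANSACTION once the result looks right.
    ROLLBACK TRANSACTION;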
    

    The ROW_NUMBER() function assigns a number to each row. PARTITION BY restarts the numbering for each group, so here every value of uniqueid starts at 1 and counts up from there. ORDER BY determines the order in which the numbers are assigned within a group. Because every uniqueid starts at 1, any row with a ROW_NUMBER() greater than 1 is a duplicate of that uniqueid.

    To get an understanding of how the ROW_NUMBER() function works, just try it out:

    SELECT *, ROW_NUMBER() OVER (PARTITION BY uniqueid ORDER BY col2) AS RowRank
    FROM [Table]
    ORDER BY uniqueid;
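
    If you would rather not run this against a real table, here is a small self-contained sketch. The table variable @people and its rows are made up purely for illustration; only the column names come from the question:

    -- Hypothetical sample rows in a table variable, so no real table is touched.
    DECLARE @people TABLE (uniqueid INT, col2 VARCHAR(50), col3 VARCHAR(50));

    INSERT INTO @people (uniqueid, col2, col3)
    VALUES (1, 'john', 'simpson'),
           (1, 'john', 'simpson'),
           (3, 'mary', 'jones');

    -- The two uniqueid = 1 rows get RowRank 1 and 2; the uniqueid = 3 row gets RowRank 1.
    SELECT *, ROW_NUMBER() OVER (PARTITION BY uniqueid ORDER BY col2) AS RowRank
    FROM @people
    ORDER BY uniqueid;

    The row numbered 2 is the one a "RowRank > 1" filter would pick up as the duplicate.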
    

    You can adjust the ROW_NUMBER() logic to control which records you keep and which you remove.

    For instance, you might want to do this in multiple steps. To first delete only records that share a last name but have different first names, add the last name column (col3 in the example) to the PARTITION BY:

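    -- Rows are now grouped by uniqueid and col3; within each group only the first row by col2 is kept.
    WITH cte AS (SELECT *, ROW_NUMBER() OVER (PARTITION BY uniqueid, col3 ORDER BY col2) AS RowRank
                 FROM [Table])
    DELETE FROM cte
    WHERE RowRank > 1;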
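
    The ORDER BY side can be adjusted in the same way. As a sketch, again with [Table] as a placeholder name, reversing the sort keeps the row whose col2 sorts last in each uniqueid group instead of the first:

    -- ORDER BY col2 DESC gives RowRank 1 to the row with the highest col2 in each group,
    -- so that row survives and the others in the group are deleted.
    WITH cte AS (SELECT *, ROW_NUMBER() OVER (PARTITION BY uniqueid ORDER BY col2 DESC) AS RowRank
                 FROM [Table])
    DELETE FROM cte
    WHERE RowRank > 1;

    Any column or list of columns can go in the ORDER BY; whichever row sorts first in its group is the one that is kept.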
    
