How to delete duplicate rows in SQL Server?

长情又很酷 2020-11-22 00:58

How can I delete duplicate rows where no unique row id exists?

My table is

col1  col2 col3 col4 col5 col6 col7
john  1          


        
23 Answers
  •  温柔的废话
    2020-11-22 01:31

    The suggested solution above works for small and medium tables. For very large tables I can suggest the following solution, since it runs in iterations.

    1. Drop all views that depend on LargeSourceTable. You can find the dependencies in SQL Server Management Studio: right-click the table and click "View Dependencies".
    2. Rename the table:

      EXEC sp_rename 'LargeSourceTable', 'LargeSourceTable_Temp';
      GO

    3. Create LargeSourceTable again, but now add a primary key on all the columns that define a duplicate, and add WITH (IGNORE_DUP_KEY = ON). For example:

      -- Constraint names are schema-scoped: if the renamed table still owns a
      -- constraint with the same name, rename it or use a different name here.
      CREATE TABLE [dbo].[LargeSourceTable]
      (
          ID int IDENTITY(1,1),
          [CreateDate] DATETIME CONSTRAINT [DF_LargeSourceTable_CreateDate] DEFAULT (getdate()) NOT NULL,
          [Column1] CHAR (36) NOT NULL,
          [Column2] NVARCHAR (100) NOT NULL,
          [Column3] CHAR (36) NOT NULL,
          PRIMARY KEY (Column1, Column2) WITH (IGNORE_DUP_KEY = ON)
      );
      GO

    4. Recreate the views that you dropped in step 1 so that they reference the newly created table.

    5. Run the following SQL script. It copies the rows in pages of 1,000,000; you can reduce the page size to see progress messages more often.

       Note that I set IDENTITY_INSERT on and off because one of the columns contains an auto-incrementing ID, which I'm also copying.

    -- Allow explicit values in the IDENTITY column while copying.
    SET IDENTITY_INSERT LargeSourceTable ON

    DECLARE @PageNumber AS INT, @RowspPage AS INT
    DECLARE @TotalRows AS INT
    DECLARE @dt varchar(19)

    SET @PageNumber = 0
    SET @RowspPage = 1000000
    SELECT @TotalRows = count(*) FROM LargeSourceTable_Temp

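    -- Loop while the current page still starts inside the source table.
    -- (Pages are numbered from 0, so no "- 1" is needed here.)
    While (@PageNumber * @RowspPage < @TotalRows)
    Begin
        begin transaction tran_inner
            -- Read one page of the renamed table, ordered by ID.
            ; with cte as
            (
                SELECT * FROM LargeSourceTable_Temp ORDER BY ID
                OFFSET (@PageNumber * @RowspPage) ROWS
                FETCH NEXT @RowspPage ROWS ONLY
            )

            -- Rows whose key already exists are discarded by IGNORE_DUP_KEY.
            INSERT INTO LargeSourceTable
            (
                 ID
                ,[CreateDate]
                ,[Column1]
                ,[Column2]
                ,[Column3]
            )
            select
                 ID
                ,[CreateDate]
                ,[Column1]
                ,[Column2]
                ,[Column3]
            from cte

        commit transaction tran_inner

        -- Progress: rows read from the source so far (not rows kept).
        PRINT 'Page: ' + convert(varchar(10), @PageNumber)
        PRINT 'Transferred: ' + convert(varchar(20), IIF((@PageNumber + 1) * @RowspPage < @TotalRows, (@PageNumber + 1) * @RowspPage, @TotalRows))
        PRINT 'Of: ' + convert(varchar(20), @TotalRows)

        -- RAISERROR ... WITH NOWAIT sends the message to the client immediately.
        SELECT @dt = convert(varchar(19), getdate(), 121)
        RAISERROR('Inserted on: %s', 0, 1, @dt) WITH NOWAIT
        SET @PageNumber = @PageNumber + 1
    End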
    

    SET IDENTITY_INSERT LargeSourceTable OFF
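
    To see on a small scale what the copy above relies on, here is a minimal sketch that is not part of the original script. The temp tables #Source and #Target and their sample rows are hypothetical, and it assumes, as in the example table, that a duplicate is a repeated (Column1, Column2) pair. It copies rows into a table whose primary key has IGNORE_DUP_KEY = ON and then compares the row count with the number of distinct keys in the source.

    ```sql
    -- Hypothetical source table containing one duplicated key ('a', 'x').
    CREATE TABLE #Source
    (
        ID      int IDENTITY(1,1),
        Column1 CHAR(36)      NOT NULL,
        Column2 NVARCHAR(100) NOT NULL,
        Column3 CHAR(36)      NOT NULL
    );

    INSERT INTO #Source (Column1, Column2, Column3)
    VALUES ('a', N'x', '1'),
           ('a', N'x', '2'),   -- same (Column1, Column2) as the row above
           ('b', N'y', '3');

    -- Hypothetical target table: the key columns define what counts as a duplicate.
    CREATE TABLE #Target
    (
        ID      int           NOT NULL,
        Column1 CHAR(36)      NOT NULL,
        Column2 NVARCHAR(100) NOT NULL,
        Column3 CHAR(36)      NOT NULL,
        PRIMARY KEY (Column1, Column2) WITH (IGNORE_DUP_KEY = ON)
    );

    -- Rows with an already-present key are skipped with the warning
    -- "Duplicate key was ignored." instead of failing the whole statement.
    -- This sketch does not control which of the duplicate rows is kept.
    INSERT INTO #Target (ID, Column1, Column2, Column3)
    SELECT ID, Column1, Column2, Column3
    FROM #Source;

    -- The two counts should match: one surviving row per distinct key.
    SELECT
        (SELECT COUNT(*) FROM #Target) AS RowsInTarget,
        (SELECT COUNT(*)
         FROM (SELECT DISTINCT Column1, Column2 FROM #Source) AS k) AS DistinctKeysInSource;

    DROP TABLE #Target;
    DROP TABLE #Source;
    ```

    The same count comparison could be run against LargeSourceTable and LargeSourceTable_Temp once the loop has finished, before deciding whether to drop the renamed table.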
