How to update a large table with millions of rows in SQL Server?

面向向阳花 2020-11-28 10:07

I have an UPDATE statement which can update more than a million records. I want to update them in batches of 1000 or 10000. I tried with @@ROWCOUNT bu

6 answers
  •  一生所求
    2020-11-28 10:25

    I want to share my experience. A few days ago I had to update 21 million records in a table with 76 million records. My colleague suggested the following approach. For example, suppose we have the following table, 'Persons':

    Id | FirstName | LastName | Email        | JobTitle
    1  | John      | Doe      | abc1@abc.com | Software Developer
    2  | John1     | Doe1     | abc2@abc.com | Software Developer
    3  | John2     | Doe2     | abc3@abc.com | Web Designer
    

    Task: update every person whose job title is 'Software Developer' to the new job title 'Web Developer'.

    1. Create a temporary table 'Persons_SoftwareDeveloper_To_WebDeveloper (Id INT Primary Key)'.
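
    A minimal sketch of step 1. It assumes a regular (non-#temp) table is used, which would survive a dropped connection if the migration is paused; the `dbo` schema is also an assumption.

    ```sql
    -- Staging table holding only the Ids of the rows to update.
    -- The primary key gives the later range joins an index to seek on.
    CREATE TABLE dbo.Persons_SoftwareDeveloper_To_WebDeveloper
    (
        Id INT NOT NULL PRIMARY KEY
    );
    ```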

    2. Insert into the temporary table the Ids of the persons you want to update with the new job title:

    INSERT INTO Persons_SoftwareDeveloper_To_WebDeveloper SELECT Id FROM
    Persons WITH(NOLOCK) --avoid lock 
    WHERE JobTitle = 'Software Developer' 
    OPTION(MAXDOP 1) -- use only one core
    

    Depending on the row count, this statement will take some time to fill the temporary table, but NOLOCK keeps it from taking shared locks on Persons (at the cost of possibly reading uncommitted data). In my case it took about 5 minutes (21 million rows).

    3. The main idea is to generate small SQL statements that each update one Id range of the table. So, let's print them:

    DECLARE @i INT, @pagesize INT, @totalPersons INT
    SET @i = 0
    SET @pagesize = 2000
    SELECT @totalPersons = MAX(Id) FROM Persons

    -- Print one UPDATE per Id range; each printed batch ends with GO.
    WHILE @i <= @totalPersons
    BEGIN
        PRINT '
    UPDATE persons
      SET persons.JobTitle = ''Web Developer''
      FROM Persons_SoftwareDeveloper_To_WebDeveloper tmp
      JOIN Persons persons ON tmp.Id = persons.Id
      WHERE persons.Id BETWEEN ' + CAST(@i AS VARCHAR(20)) + ' AND ' + CAST(@i + @pagesize - 1 AS VARCHAR(20)) + '
    PRINT ''Page ' + CAST((@i / @pagesize) AS VARCHAR(20)) + ' of ' + CAST(@totalPersons / @pagesize AS VARCHAR(20)) + '''
    GO
    '
        -- BETWEEN is inclusive, so the next range starts right after this one ends.
        SET @i = @i + @pagesize
    END
    

    After executing this script you will get hundreds of batches, which you can run in a single tab of SQL Server Management Studio.
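
    The question mentions @@ROWCOUNT. As an alternative to printing statements, a loop can run the batches directly. This is a sketch of a different, commonly used technique, not the approach described in this answer. It assumes the same Persons table and that no new 'Software Developer' rows arrive while it runs; the batch size and the delay are illustrative values.

    ```sql
    DECLARE @batchSize INT = 10000;
    DECLARE @rows INT = 1;

    WHILE @rows > 0
    BEGIN
        -- Each iteration commits on its own (autocommit),
        -- so locks are released between batches.
        UPDATE TOP (@batchSize) Persons
           SET JobTitle = 'Web Developer'
         WHERE JobTitle = 'Software Developer';

        -- Capture @@ROWCOUNT immediately; any later statement resets it.
        SET @rows = @@ROWCOUNT;

        -- Optional pause so other sessions can get through (assumed value).
        WAITFOR DELAY '00:00:01';
    END
    ```

    The loop stops by itself because updated rows no longer match the WHERE clause. Without an index on JobTitle each batch may have to scan the table, which is one reason the Id-range approach in this answer can be faster.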

    4. Run the printed SQL statements and check for locks on the table. You can stop the process at any time and adjust @pagesize to speed the update up or slow it down (don't forget to change @i after you pause the script).
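
    One possible way to check for locks while the batches run is to query the `sys.dm_tran_locks` view from a second session. This sketch assumes the table lives in the `dbo` schema and that your login has permission to view server state; it only lists table-level (OBJECT) locks.

    ```sql
    -- Table-level locks currently held or requested on Persons.
    SELECT request_session_id,
           request_mode,     -- e.g. IX (intent exclusive) or X (exclusive)
           request_status    -- GRANT, WAIT or CONVERT
    FROM sys.dm_tran_locks
    WHERE resource_database_id = DB_ID()
      AND resource_type = 'OBJECT'
      AND resource_associated_entity_id = OBJECT_ID('dbo.Persons');
    ```

    An X lock at OBJECT level held by the updating session would suggest that a batch escalated to a table lock, in which case a smaller @pagesize may help.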

    5. Drop the temporary table Persons_SoftwareDeveloper_To_WebDeveloper.
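
    A short sketch of the cleanup, assuming the temporary table was created under the name used in step 1. The count query is an added sanity check, not part of the original steps.

    ```sql
    -- Should return 0 once every batch has run and the application
    -- no longer writes the old job title.
    SELECT COUNT(*) AS Remaining
    FROM Persons
    WHERE JobTitle = 'Software Developer';

    -- Remove the list of Ids; it is no longer needed.
    DROP TABLE Persons_SoftwareDeveloper_To_WebDeveloper;
    ```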

    Minor note: this migration can take a while, and new rows with the old value could be inserted while it runs. So first fix the places where such rows are added. In my case I fixed the UI: 'Software Developer' -> 'Web Developer'.
