Set value to a new datetime column in a table with over 5 million rows

Submitted by 瘦欲@ on 2019-12-12 03:27:01

Question


I have a table named Example with over 5 million rows. I want to know the most efficient way to add a new datetime column that does not allow NULLs and defaults to the current date and time. Simply setting the value in a single statement would fail because of the number of rows.

The plan I have in mind would involve:

1) Create a new column that allows NULLs.

ALTER TABLE Example
ADD RecordDate datetime
GO

2) Set the value of the column to GETDATE(), 1000 rows (or more if possible) at a time.

3) Once all rows have a value, alter the column to not allow NULLs.

ALTER TABLE Example
ALTER COLUMN RecordDate datetime NOT NULL

I am not sure what the most efficient way to complete step 2 would be, so that is what I would like some tips on.


Answer 1:


To work through a large table with a sequential ID, applying updates in batches, this approach will work:

DECLARE @startID bigint
DECLARE @endID bigint

-- Start from the lowest ID in the table
SELECT @startID = MIN(ID) FROM Example

WHILE @startID IS NOT NULL BEGIN
  -- Find the last ID of the next batch of up to 1000 rows
  SELECT @endID = MAX(ID) FROM (
    SELECT TOP (1000) ID FROM Example WHERE ID >= @startID ORDER BY ID
  ) t

  UPDATE Example
  SET RecordDate = GETDATE()
  WHERE ID BETWEEN @startID AND @endID AND RecordDate IS NULL

  -- Stop once a batch updates nothing; otherwise continue from the last ID
  IF @@ROWCOUNT = 0 BEGIN
    SET @startID = NULL
  END ELSE BEGIN
    SET @startID = @endID
  END
END

The batch size is controlled by

SELECT TOP (1000) ID FROM Example WHERE ID >= @startID ORDER BY ID

Adjust the 1000 as necessary to ensure each UPDATE completes quickly. I've used this technique to update hundreds of millions of rows in batches of around 100000 per update.
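
If the table has no convenient sequential ID, a simpler variant of the same batching idea is to let UPDATE TOP pick the next set of unfilled rows. This is only a sketch and not part of the original answer: the variable names @batchSize and @rows and the batch size of 10000 are assumptions, and each pass has to locate rows where RecordDate IS NULL, so it may get slower on very large tables than the ID-range approach above.

```sql
-- Hypothetical variable names and batch size; adjust to suit the workload.
DECLARE @batchSize int = 10000;
DECLARE @rows int = 1;

WHILE @rows > 0
BEGIN
    -- Fill in at most @batchSize rows that still have no value
    UPDATE TOP (@batchSize) Example
    SET RecordDate = GETDATE()
    WHERE RecordDate IS NULL;

    -- Capture the count immediately; the loop ends when nothing was updated
    SET @rows = @@ROWCOUNT;
END
```

Because each UPDATE runs as its own statement, locks are held for one batch at a time rather than for the whole table.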




Answer 2:


I would suggest:

ALTER TABLE Example ADD RecordDate datetime NOT NULL DEFAULT getdate();

No matter how you attempt this, you are going to have to rewrite all the data records to add the extra bytes on each page for the value -- even if the value is NULL.

I had a thought that the following would minimize changes to the data:

ALTER TABLE Example ADD _RecordDate datetime;

ALTER TABLE Example ADD FirstDateTime as (cast(<current datetime> as datetime));

ALTER TABLE Example ADD RecordDate as COALESCE(_RecordDate, FirstDateTime);

On second thought, you still have to reserve the space on the page for _RecordDate, so the first method is probably the best.

Another alternative would be to set up another table with the same primary key and the record date. This requires a left join to get the information, but that would only be needed when you are accessing the column.
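
A rough sketch of that side-table idea follows. It assumes Example has a bigint primary key named ID (as the first answer's script suggests); the table name ExampleRecordDate is made up for illustration.

```sql
-- Hypothetical side table keyed on the same primary key as Example
CREATE TABLE ExampleRecordDate (
    ID bigint NOT NULL PRIMARY KEY,
    RecordDate datetime NOT NULL DEFAULT GETDATE()
);

-- The date is only joined in when a query actually needs it;
-- rows with no entry in the side table come back with a NULL RecordDate
SELECT e.ID, r.RecordDate
FROM Example AS e
LEFT JOIN ExampleRecordDate AS r
    ON r.ID = e.ID;
```

With this layout the existing Example table is not altered at all, at the cost of the extra join.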




Answer 3:


mmilan

You can try this script for step 1:

ALTER TABLE Example ADD RecordDate DATETIME NOT NULL DEFAULT GETDATE()    

With this, steps 2 and 3 are no longer needed.
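
As a small variation on that one-statement approach, the default can be given an explicit name so it is easier to find or drop later; without one, SQL Server generates a constraint name automatically. The constraint name DF_Example_RecordDate below is an assumption chosen for illustration.

```sql
-- Same single-statement approach, with a named default constraint
-- (DF_Example_RecordDate is a hypothetical name)
ALTER TABLE Example
ADD RecordDate datetime NOT NULL
    CONSTRAINT DF_Example_RecordDate DEFAULT GETDATE();
```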



Source: https://stackoverflow.com/questions/28347233/set-value-to-a-new-datetime-column-in-a-table-with-over-5-million-rows
