Question
I have a table named Example that has over 5 million rows.
I want to know the most efficient way to add a new DateTime column that does not allow nulls and defaults to the current date and time. Setting the value for every row in a single statement would fail because of the number of rows.
The plan I have in mind would involve:
1) creating a new column that allows nulls.
ALTER TABLE Example
ADD RecordDate datetime
GO
2) setting the column to GETDATE(), 1000 rows (or more if possible) at a time.
3) once all rows have a value, I would alter the column to not allow nulls.
ALTER TABLE Example
ALTER COLUMN RecordDate datetime NOT NULL
I am not sure what the most efficient way of completing step 2 would be, so that is what I would like some tips on.
Answer 1:
To work through a large table with a sequential ID, applying updates in batches, this approach will work:
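DECLARE @startID bigint
DECLARE @endID bigint

-- Start at the lowest ID in the table
SELECT @startID = MIN(ID) FROM Example

WHILE @startID IS NOT NULL BEGIN
    -- Find the last ID of the next batch of up to 1000 rows
    SELECT @endID = MAX(ID) FROM (
        SELECT top(1000) ID from Example where ID>=@startID ORDER BY ID
    ) t

    -- Fill in only the rows in this ID range that have no value yet
    UPDATE Example
    SET RecordDate = GETDATE()
    WHERE ID BETWEEN @startID AND @endID AND RecordDate IS NULL

    IF @@ROWCOUNT = 0 BEGIN
        -- No rows were updated in this batch: end the loop
        SET @startID = NULL
    END ELSE BEGIN
        -- Continue from the end of this batch
        SET @startID = @endID
    END
END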
The batch size is controlled by
SELECT top(1000) ID from Example where ID>=@startID ORDER BY ID
Adjust the 1000 as necessary to ensure each UPDATE completes quickly. I've used this technique to update hundreds of millions of rows in batches of around 100000 per update.
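If the table has no convenient sequential ID, a commonly used alternative is to let the engine pick the rows: repeat UPDATE TOP (n) on rows that are still NULL until nothing is left. This is a different technique from the ID-range loop above, shown only as a minimal sketch. It assumes SQL Server 2008 or later and the nullable RecordDate column from step 1; the @batchSize and @rows variables and the value 10000 are illustrative and do not come from the question.

```sql
-- Sketch only: assumes SQL Server and that RecordDate was added as a nullable column.
DECLARE @batchSize int = 10000;  -- illustrative value; tune for your system
DECLARE @rows int = 1;

WHILE @rows > 0
BEGIN
    -- Outside an explicit transaction, each UPDATE commits on its own,
    -- which keeps individual transactions small.
    UPDATE TOP (@batchSize) Example
    SET RecordDate = GETDATE()
    WHERE RecordDate IS NULL;

    SET @rows = @@ROWCOUNT;  -- becomes 0 once no NULL rows remain
END
```

Without an index on RecordDate, each pass has to search for the remaining NULL rows, so the ID-range loop may scale better on very large tables.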
Answer 2:
I would suggest:
ALTER TABLE Example ADD RecordDate datetime NOT NULL DEFAULT getdate();
No matter how you attempt this, you are going to have to rewrite all the data records to add the extra bytes on each page for the value -- even if the value is NULL.
I had a thought that the following would minimize changes to the data:
ALTER TABLE Example ADD _RecordDate datetime;
ALTER TABLE Example ADD FirstDateTime as (cast(<current datetime> as datetime));
ALTER TABLE Example ADD RecordDate as COALESCE(_RecordDate, FirstDateTime);
On second thought, you still have to reserve the space on the page for _RecordDate, so the first method is probably the best.
Another alternative would be to set up another table with the same primary key and the record date. This requires a left join to get the information, but that would only be needed when you are accessing the column.
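The separate-table idea could look roughly like the following sketch. The table name ExampleRecordDate is hypothetical, and ID is assumed to be the bigint primary key of Example (the question does not state the key).

```sql
-- Sketch only: ExampleRecordDate is a hypothetical name, and ID is assumed
-- to be the bigint primary key of Example.
CREATE TABLE ExampleRecordDate (
    ID bigint NOT NULL PRIMARY KEY,
    RecordDate datetime NOT NULL DEFAULT GETDATE()
);

-- Read the date only when it is needed; rows of Example that have
-- no entry in the side table come back with a NULL RecordDate.
SELECT e.ID, d.RecordDate
FROM Example AS e
LEFT JOIN ExampleRecordDate AS d
    ON d.ID = e.ID;
```

The existing 5 million rows of Example are not touched at all in this variant; the cost moves to the join at read time.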
Answer 3:
mmilan
You can try this script for step #1:
ALTER TABLE Example ADD RecordDate DATETIME NOT NULL DEFAULT GETDATE()
Now you don't need steps #2 and #3.
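As an optional refinement, the default can be given an explicit name so that it is easier to find or drop later. This is a sketch assuming SQL Server; the constraint name DF_Example_RecordDate is hypothetical.

```sql
-- Sketch only: same statement as above, with a hypothetical name for the default constraint.
ALTER TABLE Example
ADD RecordDate DATETIME NOT NULL
    CONSTRAINT DF_Example_RecordDate DEFAULT GETDATE();
```

Without the CONSTRAINT clause, SQL Server generates a name for the default constraint automatically.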
Source: https://stackoverflow.com/questions/28347233/set-value-to-a-new-datetime-column-in-a-table-with-over-5-million-rows