Question
My project has to deal with a huge database. In the worst case, a table can hold more than 80 million rows.
Now, I have two tables, T1 and T2. I have to copy data from table T1 to table T2:
- if a row in table T1 already exists in table T2 (same primary key), then update the other columns of that row in T2 with the values from T1
- else insert a new row into T2
At first, I used a while loop to go through the 80 million rows in T1 and update or insert each one into T2. This is very slow: it takes more than 10 hours to finish. However, if any row causes an error, I can catch the error and skip that row.
After that, I use a query like:
UPDATE T2
SET T2.Column1 = T1.Column1,
    T2.Column2 = T1.Column2
FROM Table2 T2
JOIN Table1 T1 ON T1.ID = T2.ID
This is much faster and takes only about 1 to 2 hours to finish. But if any row has an error, the whole query fails and nothing is updated.
So, my questions are:
Is there any way for the above query to ignore the rows that cause errors and continue with the valid rows?
If there is no way to do that, what can I do that runs faster than the first method and still lets me catch the rows that cause errors?
P.S.: I have tried splitting the table into multiple smaller parts and then updating or inserting all the parts at the same time, but it was not any faster.
Answer 1:
You can try deleting the existing rows from T2 and then bulk inserting all rows from T1. Whether this is practical depends on the number of existing rows: if it is too large, this approach won't work.
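A minimal sketch of this delete-then-insert idea, assuming both tables expose the ID, Column1 and Column2 columns used in the question's query (the real column list is not shown). Wrapping the two statements in one transaction is an addition of this sketch, so that Table2 is not left with rows deleted but not re-inserted; THROW requires SQL Server 2012 or later.

```sql
BEGIN TRY
    BEGIN TRANSACTION;

    -- Remove the rows of Table2 that also exist in Table1 (same primary key).
    DELETE T2
    FROM Table2 AS T2
    JOIN Table1 AS T1 ON T1.ID = T2.ID;

    -- Re-insert every row of Table1 in one set-based statement.
    INSERT INTO Table2 (ID, Column1, Column2)
    SELECT ID, Column1, Column2
    FROM Table1;

    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    -- Undo the delete if the insert failed, then re-raise the original error.
    IF @@TRANCOUNT > 0 ROLLBACK TRANSACTION;
    THROW;
END CATCH;
```

Note that a single invalid row still aborts the whole insert here, so this addresses speed but not the question of skipping bad rows.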
Answer 2:
For the functionality you are asking for, I would suggest the following:
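MERGE INTO table2 AS target
USING
(
    SELECT id, column1, column2 FROM table1
) AS source ([Id], [Column1], [Column2])
ON target.[Id] = source.[Id]
-- Row exists in both tables: overwrite the non-key columns.
WHEN MATCHED THEN
    UPDATE SET
        target.Column1 = source.Column1,
        target.Column2 = source.Column2
-- Row exists only in table2: this clause deletes it.
-- Omit it if rows of table2 that are missing from table1 must be kept.
WHEN NOT MATCHED BY SOURCE THEN
    DELETE
-- Row exists only in table1: insert it.
WHEN NOT MATCHED BY TARGET THEN
    INSERT ([Id], [Column1], [Column2])
    VALUES (source.[Id], source.[Column1], source.[Column2])
;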
As for ignoring errors, I see that as the wrong approach. In this regard, I would invest some effort in data validation instead.
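One way the data validation could look, as a sketch only: the thread never states the column types or what kind of error occurs, so this assumes that Table1.Column1 holds text that must fit an INT column in Table2. Table1_Errors is a hypothetical table name introduced for the example. TRY_CAST is available from SQL Server 2012.

```sql
-- Assumption: Table1.Column1 is text and Table2.Column1 is INT.
-- Table1_Errors is a hypothetical name; SELECT ... INTO creates it.
SELECT ID, Column1
INTO Table1_Errors
FROM Table1
WHERE Column1 IS NOT NULL
  AND TRY_CAST(Column1 AS INT) IS NULL;  -- value present but not convertible

-- Only the rows that passed validation are then used as the source.
SELECT T1.ID, T1.Column1, T1.Column2
FROM Table1 AS T1
WHERE NOT EXISTS (SELECT 1 FROM Table1_Errors AS E WHERE E.ID = T1.ID);
```

The second query could stand in for the inner SELECT of a set-based statement, so that invalid rows are recorded up front instead of aborting the statement.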
Answer 3:
I have solved the problem with my second method. I use TRY_CAST to prevent exceptions when inserting or updating rows. Any invalid data becomes NULL. After it finishes, I compare the two tables and find the rows that differ. Those rows are the error rows.
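The approach described above might look roughly like the following. The target types (INT for Column1, DATE for Column2) are assumptions made for illustration, since the thread does not give the real ones.

```sql
-- Assumption: Table2.Column1 is INT and Table2.Column2 is DATE.
-- TRY_CAST returns NULL instead of raising an error when a value cannot be converted.
UPDATE T2
SET T2.Column1 = TRY_CAST(T1.Column1 AS INT),
    T2.Column2 = TRY_CAST(T1.Column2 AS DATE)
FROM Table2 AS T2
JOIN Table1 AS T1 ON T1.ID = T2.ID;

-- Compare the two tables afterwards: a row whose source value was present
-- but whose target value ended up NULL failed conversion.
SELECT T1.ID, T1.Column1, T1.Column2
FROM Table1 AS T1
JOIN Table2 AS T2 ON T2.ID = T1.ID
WHERE (T1.Column1 IS NOT NULL AND T2.Column1 IS NULL)
   OR (T1.Column2 IS NOT NULL AND T2.Column2 IS NULL);
```

TRY_CAST only guards against conversion failures. If a target column is declared NOT NULL, or another constraint is violated, the statement would still fail as a whole.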
Source: https://stackoverflow.com/questions/17188713/ignore-error-row-when-update-or-insert-sql-server