问题
I am currently executing the simply query below with python using pyodbc to insert data in SQL server table:
import pyodbc
table_name = 'my_table'
insert_values = [(1,2,3),(2,2,4),(3,4,5)]
cnxn = pyodbc.connect(...)
cursor = cnxn.cursor()
cursor.execute(
' '.join([
'insert into',
table_name,
'values',
','.join(
[str(i) for i in insert_values]
)
])
)
cursor.commit()
This should work as long as there are no duplicate keys (let's assume the first column contains the key). However for data with duplicate keys (data already existing in the table) it will raise an error. How can I, in one go, insert multiple rows in a SQL server table using pyodbc such that data with duplicate keys simply gets updated.
Note: There are solutions proposed for single rows of data, however, I would like to insert multiple rows at once (avoid loops)!
回答1:
This can be done using MERGE
. Let's say you have a key column ID
, and two columns col_a
and col_b
(you need to specify column names in update statements), then the statement would look like this:
MERGE INTO MyTable as Target
USING (SELECT * FROM
(VALUES (1, 2, 3), (2, 2, 4), (3, 4, 5))
AS s (ID, col_a, col_b)
) AS Source
ON Target.ID=Source.ID
WHEN NOT MATCHED THEN
INSERT (ID, col_a, col_b) VALUES (Source.ID, Source.col_a, Source.col_b)
WHEN MATCHED THEN
UPDATE SET col_a=Source.col_a, col_b=Source.col_b;
You can give it a try on rextester.com/IONFW62765.
Basically, I'm creating a Source
table "on-the-fly" using the list of values, which you want to upsert. When you then merge the Source
table with the Target
, you can test the MATCHED
condition (Target.ID=Source.ID
) on each row (whereas you would be limited to a single row when just using a simple IF <exists> INSERT (...) ELSE UPDATE (...)
condition).
In python with pyodbc
, it should probably look like this:
import pyodbc
insert_values = [(1, 2, 3), (2, 2, 4), (3, 4, 5)]
table_name = 'my_table'
key_col = 'ID'
col_a = 'col_a'
col_b = 'col_b'
cnxn = pyodbc.connect(...)
cursor = cnxn.cursor()
cursor.execute(('MERGE INTO {table_name} as Target '
'USING (SELECT * FROM '
'(VALUES {vals}) '
'AS s ({k}, {a}, {b}) '
') AS Source '
'ON Target.ID=Source.ID '
'WHEN NOT MATCHED THEN '
'INSERT ({k}, {a}, {b}) VALUES (Source.{k}, Source.{a}, Source.{b}) '
'WHEN MATCHED THEN '
'UPDATE SET {k}=Source.{a}, col_b=Source.{b};'
.format(table_name=table_name,
vals=','.join([str(i) for i in insert_values]),
k=key_col,
a=col_a,
b=col_b)))
cursor.commit()
You can read up more on MERGE
in the SQL Server docs.
回答2:
Given a dataframe(df) I used the code from ksbg to upsert into a table. Note that I looked for a match on two columns (date and stationcode) you can use one. Code generates the query given any df.
def append(df, c):
table_name = 'ddf.ddf_actuals'
columns_list = df.columns.tolist()
columns_list_query = f'({(",".join(columns_list))})'
sr_columns_list = [f'Source.{i}' for i in columns_list]
sr_columns_list_query = f'({(",".join(sr_columns_list))})'
up_columns_list = [f'{i}=Source.{i}' for i in columns_list]
up_columns_list_query = f'{",".join(up_columns_list)}'
rows_to_insert = [row.tolist() for idx, row in final_list.iterrows()]
rows_to_insert = str(rows_to_insert).replace('[', '(').replace(']', ')')[1:][:-1]
query = f"MERGE INTO {table_name} as Target \
USING (SELECT * FROM \
(VALUES {rows_to_insert}) \
AS s {columns_list_query}\
) AS Source \
ON Target.stationcode=Source.stationcode AND Target.date=Source.date \
WHEN NOT MATCHED THEN \
INSERT {columns_list_query} VALUES {sr_columns_list_query} \
WHEN MATCHED THEN \
UPDATE SET {up_columns_list_query};"
c.execute(query)
c.commit()
来源:https://stackoverflow.com/questions/50135898/multi-row-upsert-insert-or-update-from-python