Upsert / merge tables in SQLite

Posted by 风格不统一 on 2020-03-04 18:58:52

Question


I have created a database using sqlite3 in python that has thousands of tables. Each of these tables contains thousands of rows and ten columns. One of the columns is the date and time of an event: it is a string that is formatted as YYYY-mm-dd HH:MM:SS, which I have defined to be the primary key for each table. Every so often, I collect some new data (hundreds of rows) for each of these tables. Each new dataset is pulled from a server and loaded in directly as a pandas data frame or is stored as a CSV file. The new data contains the same ten columns as my original data. I need to update the tables in my database using this new data in the following way:

  1. Given a table in my database, for each row in the new dataset, if the date and time of the row matches the date and time of an existing row in my database, update the remaining columns of that row using the values in the new dataset.
  2. If the date and time does not yet exist, create a new row and insert it into my database.

Below are my questions:

  1. I've done some searching on Google and it looks like I should be using the UPSERT (merge) functionality of sqlite but I can't seem to find any examples showing how to use it. Is there an actual UPSERT command, and if so, could someone please provide an example (preferably with sqlite3 in Python) or point me to a helpful resource?
  2. Also, is there a way to do this in bulk so that I can UPSERT each new dataset into my database without having to go row by row? (I found this link, which suggests that it is possible, but I'm new to using databases and am not sure how to actually run the UPSERT command.)
  3. Can UPSERT also be performed directly using pandas.DataFrame.to_sql?

My backup solution is loading in the table to be UPSERTed using pd.read_sql_query("SELECT * from table", con), performing pandas.DataFrame.merge, deleting the said table from the database, and then adding in the updated table to the database using pd.DataFrame.to_sql (but this would be inefficient).


Answer 1:


First, even though the questions are related, ask them separately in the future.

  1. Yes. SQLite has supported an UPSERT clause (INSERT ... ON CONFLICT ... DO UPDATE) since version 3.24.0. The official SQLite documentation on UPSERT describes the syntax, but it is a bit abstract. You can check examples and discussion here: SQLite - UPSERT *not* INSERT or REPLACE

  2. Run the statements inside a single transaction; they are then committed together as one batch instead of one row at a time.

  3. As the existence of this library suggests, to_sql does not generate UPSERT commands (only INSERT).
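
To make points 1 and 2 concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name events and the columns event_time, value_a, and value_b are hypothetical stand-ins for the asker's real schema, and the ON CONFLICT clause assumes the SQLite library bundled with Python is version 3.24.0 or newer.

```python
import sqlite3

# Hypothetical schema: a text primary key plus two value columns.
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE events (event_time TEXT PRIMARY KEY, value_a REAL, value_b REAL)"
)
con.execute("INSERT INTO events VALUES ('2020-01-01 00:00:00', 1.0, 2.0)")
con.commit()

new_rows = [
    ("2020-01-01 00:00:00", 10.0, 20.0),  # existing key: the row is updated
    ("2020-01-02 00:00:00", 3.0, 4.0),    # new key: the row is inserted
]

# "excluded" refers to the row that the INSERT tried to add.
upsert_sql = """
    INSERT INTO events (event_time, value_a, value_b)
    VALUES (?, ?, ?)
    ON CONFLICT(event_time) DO UPDATE SET
        value_a = excluded.value_a,
        value_b = excluded.value_b
"""

# "with con" runs the whole batch in one transaction:
# it commits on success and rolls back if an exception is raised.
with con:
    con.executemany(upsert_sql, new_rows)

result = con.execute("SELECT * FROM events ORDER BY event_time").fetchall()
print(result)
```

If the new data is already in a pandas data frame, its rows can be handed to executemany as plain tuples with df.itertuples(index=False, name=None), provided the column order matches the INSERT column list.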




Answer 2:


Instead of using the UPSERT command, why not write your own algorithm that looks up each date and time, updates the row if it is found, and inserts a new row otherwise? Check out the code I wrote for you below, and let me know if you are still confused. You can apply it to hundreds of tables by replacing the table name in the algorithm with a variable and looping over the list of your table names.

import sqlite3
import pandas as pd

connection_str = "my_database.db"                   # Placeholder: path to your SQLite database file
csv_data = pd.read_csv("my_CSV_file.csv")           # Your CSV data path

def manual_upsert():
    con = sqlite3.connect(connection_str)
    cur = con.cursor()
    cur.execute("SELECT my_date_column FROM my_CSV_data")   # Fetch the dates already stored in the table
    existing_dates = {row[0] for row in cur.fetchall()}     # A set makes the membership test fast

    # Iterate over the rows as plain tuples. Looping over the DataFrame itself
    # would yield column names, not rows.
    # Assumes the date column is at index 0, followed by column1..column3.
    for new_data in csv_data.itertuples(index=False, name=None):
        if new_data[0] in existing_dates:
            # The date already exists: update the remaining columns of that row.
            cur.execute("UPDATE my_CSV_data SET column1=?, column2=?, column3=? WHERE my_date_column=?",
                        (new_data[1], new_data[2], new_data[3], new_data[0]))
        else:
            # The date was not found: insert a new row.
            cur.execute("INSERT INTO my_CSV_data VALUES(?,?,?,?)",
                        (new_data[0], new_data[1], new_data[2], new_data[3]))
            existing_dates.add(new_data[0])         # Handle a repeated date later in the same CSV
    con.commit()
    con.close()


manual_upsert()
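
The answer mentions reusing the same logic for hundreds of tables by substituting the table name. One way that could look is sketched below. Note that it swaps the manual lookup for SQLite's own UPSERT clause (which needs SQLite 3.24.0 or newer), and that the helper names quote_ident, build_upsert_sql, and upsert_many_tables, as well as the tables sensor_a and sensor_b, are hypothetical. Table and column names cannot be bound with ? placeholders, so the sketch quotes them as identifiers when building the statement.

```python
import sqlite3

def quote_ident(name):
    """Quote an SQL identifier; identifiers cannot be passed as '?' parameters."""
    return '"' + name.replace('"', '""') + '"'

def build_upsert_sql(table, key_column, value_columns):
    """Build an INSERT ... ON CONFLICT DO UPDATE statement for one table."""
    columns = [key_column] + list(value_columns)
    column_list = ", ".join(quote_ident(c) for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    updates = ", ".join(
        f"{quote_ident(c)} = excluded.{quote_ident(c)}" for c in value_columns
    )
    return (
        f"INSERT INTO {quote_ident(table)} ({column_list}) "
        f"VALUES ({placeholders}) "
        f"ON CONFLICT({quote_ident(key_column)}) DO UPDATE SET {updates}"
    )

def upsert_many_tables(con, new_data, key_column, value_columns):
    """new_data maps a table name to a list of row tuples (key first, then values)."""
    with con:  # one transaction covering every table
        for table, rows in new_data.items():
            sql = build_upsert_sql(table, key_column, value_columns)
            con.executemany(sql, rows)

# Small demonstration with two hypothetical tables sharing one layout.
con = sqlite3.connect(":memory:")
for table in ("sensor_a", "sensor_b"):
    con.execute(
        f"CREATE TABLE {quote_ident(table)} "
        "(my_date_column TEXT PRIMARY KEY, column1 REAL, column2 REAL)"
    )
con.execute("INSERT INTO sensor_a VALUES ('2020-01-01 00:00:00', 1.0, 2.0)")
con.commit()

upsert_many_tables(
    con,
    {
        "sensor_a": [("2020-01-01 00:00:00", 9.0, 8.0)],  # updates the existing row
        "sensor_b": [("2020-01-01 00:00:00", 5.0, 6.0)],  # inserts a new row
    },
    key_column="my_date_column",
    value_columns=["column1", "column2"],
)

rows_a = con.execute("SELECT * FROM sensor_a").fetchall()
rows_b = con.execute("SELECT * FROM sensor_b").fetchall()
print(rows_a, rows_b)
```

Building the statement per table keeps the row values parameterized while the table name varies. The table names should still come from a trusted list, since they end up in the SQL text.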


Source: https://stackoverflow.com/questions/60220449/upsert-merge-tables-in-sqlite
