I have a data frame that I want to write to a Postgres database. This functionality needs to be part of a Flask app.
For now, I'm runn…
You can use those connections and avoid SQLAlchemy. This is going to sound rather unintuitive, but it will be much faster than regular inserts (even if you were to drop the ORM and make a raw query, e.g. with executemany; a sketch of that slower baseline is shown below). Inserts are slow even with raw queries, but you'll see that COPY is mentioned several times in How to speed up insertion performance in PostgreSQL. In this instance, my motivation for the approach below is to use COPY instead of INSERT.
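For contrast, this is roughly what that executemany baseline looks like. It's a minimal sketch, not code from the question: the table name the_table_name matches the example further down, the column names col_a and col_b are placeholders, and df is assumed to be your DataFrame.

import psycopg2
from flask import current_app

# Slow baseline: psycopg2 still issues one parameterized INSERT per row.
# the_table_name matches the example below; col_a/col_b are placeholders.
with psycopg2.connect(dbname=current_app.config['POSTGRES_DB'],
                      user=current_app.config['POSTGRES_USER'],
                      password=current_app.config['POSTGRES_PW'],
                      host=current_app.config['POSTGRES_URL']) as conn:
    c = conn.cursor()
    c.executemany(
        'INSERT INTO the_table_name (col_a, col_b) VALUES (%s, %s)',
        df.values.tolist(),
    )
    conn.commit()

Even though executemany batches the calls on the client side, the server still processes one statement per row, which is why it loses badly to COPY.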
Suggested approach using cursor.copy_from():
import csv
import io

import psycopg2
from flask import current_app

df = "<your_df_here>"
# Drop any columns you don't want in the insert data here

# First take the headers
headers = list(df.columns)

# Now get a nested list of values
data = df.values.tolist()

# Create an in-memory CSV file
string_buffer = io.StringIO()
csv_writer = csv.writer(string_buffer)
csv_writer.writerows(data)

# Reset the buffer back to the first line
string_buffer.seek(0)

# Open a connection to the db (which I think you already have available)
with psycopg2.connect(dbname=current_app.config['POSTGRES_DB'],
                      user=current_app.config['POSTGRES_USER'],
                      password=current_app.config['POSTGRES_PW'],
                      host=current_app.config['POSTGRES_URL']) as conn:
    c = conn.cursor()
    # Now upload the data as though it were a file
    c.copy_from(string_buffer, 'the_table_name', sep=',', columns=headers)
    conn.commit()
This should be orders of magnitude faster than executing individual INSERT statements.
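One caveat: copy_from(..., sep=',') splits on raw commas and does not understand CSV quoting, so it will break if any value contains a comma, quote character, or newline. If your data can contain those, cursor.copy_expert with an explicit COPY ... FROM STDIN WITH (FORMAT CSV) statement is safer, because the server then parses the buffer as real CSV. A sketch, reusing string_buffer and headers from above and assuming the column names are plain identifiers that need no quoting:

import psycopg2
from flask import current_app

# Rewind the same in-memory CSV buffer built above
string_buffer.seek(0)

with psycopg2.connect(dbname=current_app.config['POSTGRES_DB'],
                      user=current_app.config['POSTGRES_USER'],
                      password=current_app.config['POSTGRES_PW'],
                      host=current_app.config['POSTGRES_URL']) as conn:
    c = conn.cursor()
    # FORMAT CSV makes the server handle quoted fields and embedded commas.
    # Assumes headers contains plain identifiers that need no quoting.
    copy_sql = 'COPY the_table_name ({}) FROM STDIN WITH (FORMAT CSV)'.format(
        ', '.join(headers))
    c.copy_expert(copy_sql, string_buffer)
    conn.commit()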