Question
I have a pandas DataFrame with NULL values in some rows. When I insert it into PostgreSQL, some nulls in date-type columns turn into the string 'NaT', or into 'NaN'. I would like each of them to be a real NULL, that is, an empty cell.
Sample dataframe before insert:
import psycopg2
import pandas as pd
import numpy as np

conn = psycopg2.connect(dbname='myDB', host='amazonaws.com',
                        port='2222', user='mysuser', password='mypass')
cur = conn.cursor()

df = pd.DataFrame({'zipcode': [1, np.nan, 22, 88], 'city': ['A', 'h', 'B', np.nan]})
subset = df[['zipcode', 'city']]
# One tuple per row; the NaN values are still present at this point
data = [tuple(x) for x in subset.values]
# One %s placeholder per row; psycopg2 expands each tuple into (v1, v2)
records_list_template = ','.join(['%s'] * len(data))
insert_query = 'insert into public.MyTable (zipcode, city) values {}'.format(records_list_template)
cur.execute(insert_query, data)
conn.commit()
Result in the PostgreSQL table: the missing values are stored as 'NaN'.
Expected result: the missing values are stored as NULL.
Answer 1:
You can convert NaN to None like this:
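df = pd.DataFrame({
    'zipcode': [1, np.nan, 22, 88],
    'city': ['A', 'h', 'B', np.nan],
    'date': ['2019-01-01', '2019-01-02', pd.NaT, pd.NaT]})
# Parse the strings first so that every non-missing value has .strftime
df['date'] = pd.to_datetime(df['date'])
df['date'] = [d.strftime('%Y-%m-%d') if not pd.isnull(d) else None for d in df['date']]
# Cast to object so that None is kept instead of being coerced back to NaN
subset = df.astype(object).where(pd.notnull(df), None)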
See DataFrame.where
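If you would rather leave the dataframe untouched, the same conversion can be applied at the point where the row tuples are built. The sketch below is one possible way to do that and is not part of the original answer: the helper names `none_if_missing` and `clean_rows` are made up for illustration, and the approach relies on NaN-like scalars (`float('nan')`, `np.nan`, `pd.NaT`) comparing unequal to themselves.

```python
def none_if_missing(value):
    """Return None for NaN-like scalars, otherwise return the value unchanged."""
    try:
        # float('nan'), np.nan and pd.NaT are all unequal to themselves
        is_missing = bool(value != value)
    except (TypeError, ValueError):
        # Values with unusual comparison semantics are passed through as-is
        is_missing = False
    return None if is_missing else value


def clean_rows(rows):
    """Build plain tuples in which every missing value is None."""
    return [tuple(none_if_missing(v) for v in row) for row in rows]


# Hypothetical rows shaped like the output of `subset.values` in the question
rows = [(1.0, 'A'), (float('nan'), 'h'), (22.0, 'B'), (88.0, float('nan'))]
data = clean_rows(rows)
```

With the question's code this would presumably become `data = clean_rows(subset.values)`. psycopg2 adapts Python `None` to SQL NULL, so no further handling is needed at insert time.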
Answer 2:
Replace every NaN in the dataframe with None, like this:
df = df.replace({np.nan: None})
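Once the missing values are None, psycopg2 sends them as SQL NULL. As a rough sketch of the insert step, the query text and its parameters can be built without touching the database, which makes the placeholder logic easy to check. The function name `build_insert` is hypothetical; the table and column names are the ones from the question.

```python
def build_insert(table, columns, rows):
    """Return (query, params) for a multi-row INSERT with %s placeholders.

    `table` and `columns` are interpolated directly into the SQL text, so
    they must come from trusted code, never from user input.
    """
    rows = [tuple(row) for row in rows]
    if not rows:
        raise ValueError('no rows to insert')
    # One "(%s, %s, ...)" group per row
    group = '(' + ', '.join(['%s'] * len(columns)) + ')'
    query = 'insert into {} ({}) values {}'.format(
        table, ', '.join(columns), ', '.join([group] * len(rows)))
    # Flatten the rows so that each value matches exactly one placeholder
    params = [value for row in rows for value in row]
    return query, params


query, params = build_insert(
    'public.MyTable', ['zipcode', 'city'], [(1, 'A'), (None, 'h')])
# With an open cursor this would be run as: cur.execute(query, params)
```

Keeping the values in `params` rather than formatting them into the SQL string lets the driver handle quoting and the None-to-NULL conversion.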
Source: https://stackoverflow.com/questions/54428797/python-psycopg2-insert-null-in-some-rows-in-postgresql-table