Writing JSON column to Postgres using Pandas .to_sql

peralmq

I've been searching the web for a solution but couldn't find one, so here is what we came up with (there may be better ways, but at least it's a start if someone else runs into this).

Specify the dtype parameter in to_sql.

We went from:

df.to_sql(table_name, analytics_db)

to:

df.to_sql(table_name, analytics_db,
          dtype={'name_of_json_column_in_source_table': sqlalchemy.types.JSON})

and it just works.
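For context, here is a minimal end-to-end sketch of the dtype approach. The connection URL, the events table name, and the payload column are placeholders I've made up, not anything from the original setup:

import pandas as pd
import sqlalchemy

# placeholder connection string -- swap in your own credentials/database
engine = sqlalchemy.create_engine('postgresql://user:pass@localhost:5432/analytics')

df = pd.DataFrame({
    'id': [1, 2],
    'payload': [{'a': 1}, {'b': 2, 'c': 'text'}],  # plain Python dicts
})

# dtype maps the column to SQLAlchemy's JSON type, which serializes
# the dicts on insert -- no manual json.dumps needed
df.to_sql('events', engine, if_exists='replace', index=False,
          dtype={'payload': sqlalchemy.types.JSON})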

If you (re-)create the JSON column using json.dumps(), you're all set. This way the data can be written not only with pandas' .to_sql() method, but also with PostgreSQL's much faster COPY (via psycopg2's copy_expert() or SQLAlchemy's raw_connection()).

For the sake of simplicity, let's assume that we have a column of dictionaries that should be written into a JSON(B) column:

import json
import pandas as pd

df = pd.DataFrame([['row1', {'a': 1, 'b': 2}],
                   ['row2', {'a': 3, 'b': 4, 'c': 'some text'}]],
                  columns=['r', 'kv'])

# conversion function:
def dict2json(dictionary):
    return json.dumps(dictionary, ensure_ascii=False)

# overwrite the dict column with JSON strings
df['kv'] = df.kv.map(dict2json)
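From here, both write paths mentioned above work. Below is a hedged sketch of each; the json_demo table name and the connection URL are my own assumptions, and running both paths back to back would insert the rows twice, so in practice you'd pick one:

import io
import sqlalchemy

engine = sqlalchemy.create_engine('postgresql://user:pass@localhost:5432/analytics')

# create the target table with a real JSONB column up front
with engine.begin() as conn:
    conn.execute(sqlalchemy.text(
        'CREATE TABLE IF NOT EXISTS json_demo (r text, kv jsonb)'))

# path 1: pandas' .to_sql(); the JSON strings go over as text literals
# and PostgreSQL coerces them into the jsonb column
df.to_sql('json_demo', engine, if_exists='append', index=False)

# path 2: PostgreSQL COPY via psycopg2's copy_expert (much faster for
# large frames); CSV quoting keeps the embedded commas/quotes intact
buf = io.StringIO()
df.to_csv(buf, index=False, header=False)
buf.seek(0)

raw = engine.raw_connection()
try:
    with raw.cursor() as cur:
        cur.copy_expert('COPY json_demo (r, kv) FROM STDIN WITH (FORMAT CSV)', buf)
    raw.commit()
finally:
    raw.close()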