How to write data to Redshift that is a result of a dataframe created in Python?

谎友^ 2020-12-14 08:27

I have a dataframe in Python. Can I write this data to Redshift as a new table? I have successfully created a db connection to Redshift and am able to execute simple sql que

6 Answers

难免孤独 (OP)

2020-12-14 09:03

Assuming you have access to S3, this approach should work:
Step 1: Write the DataFrame as a CSV to S3 (I use the AWS SDK boto3 for this).
Step 2: You know the columns, data types, and key/index for your Redshift table from your DataFrame, so you should be able to generate a CREATE TABLE script and push it to Redshift to create an empty table.
Step 3: Send a COPY command from your Python environment to Redshift to copy the data from S3 into the empty table created in step 2.
Step 4: Before your cloud storage folks start yelling at you, delete the CSV from S3.

Works like a charm every time.
If you see yourself doing this several times, wrapping all four steps in a function keeps it tidy.
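
For what it's worth, here is a rough sketch of how the four steps might look as one function. It is not the exact code I run, just an outline under some assumptions: `conn` is an already open DB-API connection to Redshift (psycopg2 style), `s3_client` is something like `boto3.client("s3")`, and `bucket`, `key`, and `iam_role` are placeholders you supply. The dtype mapping and the `VARCHAR(256)` fallback are my own guesses at sensible defaults, and the DataFrame index is not written out.

```python
import io

# Assumed mapping from pandas dtype names to Redshift column types; extend as needed.
DTYPE_TO_REDSHIFT = {
    "int64": "BIGINT",
    "float64": "DOUBLE PRECISION",
    "bool": "BOOLEAN",
    "datetime64[ns]": "TIMESTAMP",
}


def build_create_table_sql(df, table):
    """Step 2: derive a CREATE TABLE statement from the DataFrame's dtypes."""
    columns = ", ".join(
        # Anything not in the mapping (e.g. object/string columns) falls back to VARCHAR(256).
        f'"{name}" {DTYPE_TO_REDSHIFT.get(str(dtype), "VARCHAR(256)")}'
        for name, dtype in df.dtypes.items()
    )
    return f"CREATE TABLE IF NOT EXISTS {table} ({columns});"


def build_copy_sql(table, bucket, key, iam_role):
    """Step 3: build a COPY statement that loads a headerless CSV from S3."""
    return (
        f"COPY {table} FROM 's3://{bucket}/{key}' "
        f"IAM_ROLE '{iam_role}' FORMAT AS CSV;"
    )


def dataframe_to_redshift(df, table, conn, s3_client, bucket, key, iam_role):
    """Stage df in S3 as CSV, create the table, COPY the data in, then clean up."""
    # Step 1: serialize the DataFrame to CSV in memory and upload it to S3.
    buffer = io.StringIO()
    df.to_csv(buffer, index=False, header=False)
    s3_client.put_object(
        Bucket=bucket, Key=key, Body=buffer.getvalue().encode("utf-8")
    )
    try:
        # Steps 2 and 3: create the empty table, then COPY the staged file into it.
        with conn.cursor() as cur:
            cur.execute(build_create_table_sql(df, table))
            cur.execute(build_copy_sql(table, bucket, key, iam_role))
        conn.commit()
    finally:
        # Step 4: delete the staging file, even if the load failed.
        s3_client.delete_object(Bucket=bucket, Key=key)
```

You would call it along the lines of `dataframe_to_redshift(df, "my_table", conn, boto3.client("s3"), "my-bucket", "tmp/my_table.csv", role_arn)`, where every name in that call is made up for illustration. The table and column names are pasted straight into the SQL, so only use this with names you control.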
