How to write data to Redshift that is a result of a dataframe created in Python?

谎友^ 2020-12-14 08:27

I have a dataframe in Python. Can I write this data to Redshift as a new table? I have successfully created a db connection to Redshift and am able to execute simple sql que

6 answers
  •  难免孤独
    2020-12-14 09:03

    Assuming you have access to S3, this approach should work:

    Step 1: Write the DataFrame as a CSV to S3 (I use the AWS SDK, boto3, for this)
    Step 2: You know the columns, datatypes, and key/index for your Redshift table from your DataFrame, so you should be able to generate a CREATE TABLE script and push it to Redshift to create an empty table
    Step 3: Send a COPY command from your Python environment to Redshift to copy the data from S3 into the empty table created in step 2

    Works like a charm every time.

    Step 4: Before your cloud storage folks start yelling at you, delete the CSV from S3

    If you see yourself doing this several times, wrapping all four steps in a function keeps it tidy.
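A minimal sketch of the four steps wrapped in one function might look like the following. Everything named here is an assumption for illustration and not taken from the answer: the dtype mapping, the function names, the `VARCHAR(256)` fallback, and the use of an IAM role to authorize the COPY. The S3 client is assumed to be a boto3 S3 client and `conn` any DB-API connection to Redshift (for example one opened with psycopg2); both are passed in so the function itself only needs the standard library.

```python
import io

# Assumed mapping from pandas dtype names to Redshift column types.
# Anything not listed falls back to VARCHAR(256); adjust for your data.
DTYPE_TO_REDSHIFT = {
    "int64": "BIGINT",
    "float64": "DOUBLE PRECISION",
    "bool": "BOOLEAN",
    "datetime64[ns]": "TIMESTAMP",
}


def create_table_sql(df, table):
    """Step 2: derive a CREATE TABLE statement from the DataFrame's dtypes."""
    columns = ", ".join(
        '"{}" {}'.format(name, DTYPE_TO_REDSHIFT.get(str(dtype), "VARCHAR(256)"))
        for name, dtype in df.dtypes.items()
    )
    return "CREATE TABLE IF NOT EXISTS {} ({})".format(table, columns)


def copy_sql(table, bucket, key, iam_role):
    """Step 3: build the COPY command that loads the CSV from S3."""
    return (
        "COPY {} FROM 's3://{}/{}' IAM_ROLE '{}' FORMAT AS CSV IGNOREHEADER 1"
    ).format(table, bucket, key, iam_role)


def dataframe_to_redshift(df, table, bucket, key, iam_role, s3_client, conn):
    """Run steps 1 to 4. Table and column names are assumed to be trusted,
    since they are interpolated directly into the SQL text."""
    # Step 1: write the DataFrame as a CSV to S3.
    buf = io.StringIO()
    df.to_csv(buf, index=False)
    s3_client.put_object(Bucket=bucket, Key=key, Body=buf.getvalue().encode("utf-8"))
    try:
        cur = conn.cursor()
        # Step 2: create the empty table.
        cur.execute(create_table_sql(df, table))
        # Step 3: copy the data from S3 into it.
        cur.execute(copy_sql(table, bucket, key, iam_role))
        conn.commit()
    finally:
        # Step 4: delete the CSV from S3, even if the load failed.
        s3_client.delete_object(Bucket=bucket, Key=key)
```

With real objects, a call could look like `dataframe_to_redshift(df, "my_table", "my-bucket", "tmp/my_table.csv", role_arn, boto3.client("s3"), conn)`, where the table name, bucket, key, and `role_arn` are placeholders. The header row written by `to_csv` is skipped with `IGNOREHEADER 1`, and COPY fills the columns in table order, which matches the CSV because the table is created from the same DataFrame.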
