I'm trying to do a Redshift COPY in SQLAlchemy.
The following SQL correctly copies objects from my S3 bucket into my Redshift table when I execute it in psql:
I have had success using the core expression language and Connection.execute()
(as opposed to the ORM and sessions) to copy delimited files to Redshift with the code below. Perhaps you could adapt it for JSON; a rough sketch of that adaptation follows the function.
from sqlalchemy import text

def copy_s3_to_redshift(conn, s3path, table, aws_access_key, aws_secret_key,
                        delim='\t', uncompress='auto', ignoreheader=None):
    """Copy a TSV file from S3 into Redshift.

    Note the CSV option is not used, so quotes and escapes are ignored.
    Empty fields are loaded as null. Does not commit a transaction.

    :param Connection conn: SQLAlchemy Connection
    :param str uncompress: None, 'gzip', 'lzop', or 'auto' to autodetect
        from the `s3path` extension.
    :param int ignoreheader: Ignore this many initial rows.
    :return: Whatever a COPY command returns.
    """
    if uncompress == 'auto':
        # Pick the decompression option from the file extension.
        uncompress = ('gzip' if s3path.endswith('.gz')
                      else 'lzop' if s3path.endswith('.lzo')
                      else None)

    # The table name, keys, and compression option are interpolated with
    # str.format() rather than bound as parameters, because the COPY command
    # doesn't accept the table name or keys single-quoted.
    copy = text("""
        copy "{table}"
        from :s3path
        credentials 'aws_access_key_id={aws_access_key};aws_secret_access_key={aws_secret_key}'
        delimiter :delim
        emptyasnull
        ignoreheader :ignoreheader
        compupdate on
        comprows 1000000
        {uncompress};
    """.format(uncompress=uncompress or '', table=text(table),
               aws_access_key=aws_access_key, aws_secret_key=aws_secret_key))
    return conn.execute(copy, s3path=s3path, delim=delim,
                        ignoreheader=ignoreheader or 0)
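
Since your question is about loading JSON rather than a delimited file, here is a minimal, untested sketch of how the same pattern might look with Redshift's "format as json" COPY option. The helper name copy_s3_json_to_redshift and its parameters are my own illustration, not part of the function above.

from sqlalchemy import text

def copy_s3_json_to_redshift(conn, s3path, table, aws_access_key, aws_secret_key,
                             jsonpaths='auto', gzip=False):
    """Copy JSON objects from S3 into Redshift. Does not commit a transaction.

    :param str jsonpaths: 'auto' or the S3 URI of a jsonpaths file.
    """
    # As above, the table name, keys, and gzip flag are interpolated directly;
    # the S3 path and the jsonpaths value are bound parameters, which Redshift
    # expects to be single-quoted.
    copy = text("""
        copy "{table}"
        from :s3path
        credentials 'aws_access_key_id={aws_access_key};aws_secret_access_key={aws_secret_key}'
        format as json :jsonpaths
        {gzip};
    """.format(table=table, aws_access_key=aws_access_key,
               aws_secret_key=aws_secret_key, gzip='gzip' if gzip else ''))
    return conn.execute(copy, s3path=s3path, jsonpaths=jsonpaths)

Either function should be called inside a transaction that you commit yourself, for example with SQLAlchemy's engine.begin() context manager, so the COPY is rolled back if anything fails.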