Copy data from Amazon S3 to Redshift and avoid duplicate rows

深忆病人  2021-02-01 10:03

I am copying data from Amazon S3 to Redshift. During this process, I need to avoid loading the same files twice. I don't have any unique constraints on my Redshift table.

4 Answers
  •  暗喜 (OP)  2021-02-01 10:38

    My solution is to run a DELETE command on the table before the COPY. In my use case, each run copies the records of one daily snapshot into the Redshift table, so I can run the following DELETE first to clear any previously loaded records for that day, then run the COPY:

    DELETE FROM t_data WHERE snapshot_day = 'xxxx-xx-xx';
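
    For reference, here is a minimal sketch of the whole delete-then-load cycle, wrapped in a transaction so the two steps are atomic. The S3 path, IAM role ARN, and file-format options are placeholder assumptions, not part of the original answer:

    BEGIN;

    -- Clear any rows already loaded for this snapshot day,
    -- so rerunning the load cannot create duplicates
    DELETE FROM t_data WHERE snapshot_day = '2021-02-01';

    -- Reload that day's files from S3
    -- (bucket path, role ARN, and CSV options are hypothetical)
    COPY t_data
    FROM 's3://my-bucket/snapshots/2021-02-01/'
    IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftCopyRole'
    FORMAT AS CSV
    IGNOREHEADER 1;

    END;

    Since Redshift defines but does not enforce unique constraints, making the DELETE and COPY atomic like this is the usual way to keep repeated loads idempotent.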
