Copying data from S3 to Redshift hangs

Submitted by 久未见 on 2020-07-05 12:56:28

Question


I've been trying to load data into Redshift for the last couple of days without success. I have attached the correct IAM role to the cluster and granted it access to S3, and I have tried the COPY command with both AWS credentials and the IAM role. What could be the reason for this? I am running out of options.

So the code is pretty basic, nothing fancy there. See below:

copy test_schema.test from 's3://company.test/tmp/append.csv.gz' 
iam_role 'arn:aws:iam::<rolenumber>/RedshiftCopyUnload'
delimiter ',' gzip;

I didn't include any error messages because there are none. The command simply hangs; I have left it running for well over 40 minutes with no result. In the Queries section of the Redshift console I don't see anything abnormal. I am using Aginity and SQL Workbench to run the queries.

I also tried inserting data into Redshift manually, and that seems to work. COPY and UNLOAD do not work, even though I have created roles with access to S3 and associated them with the cluster.

Thoughts?

EDIT: Solution has been found. Basically it was a connectivity problem within our VPC. A VPC endpoint had to be created and associated with the subnet used by Redshift.
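
For readers who hit the same symptom, here is a minimal sketch of creating such an endpoint with boto3. It assumes a gateway-type S3 endpoint, which the post does not specify, and every ID and the region are hypothetical placeholders. A gateway endpoint is attached to route tables, so the subnet used by Redshift has to be associated with one of the route tables passed in.

```python
def s3_gateway_endpoint_request(vpc_id, region, route_table_ids):
    """Build the keyword arguments for EC2 create_vpc_endpoint.

    Assumption: a gateway endpoint for S3 is what is needed; the original
    post only says "a VPC endpoint".
    """
    if not route_table_ids:
        raise ValueError("at least one route table ID is required")
    return {
        "VpcEndpointType": "Gateway",
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.s3",
        "RouteTableIds": list(route_table_ids),
    }


def create_s3_gateway_endpoint(vpc_id, region, route_table_ids):
    """Create the endpoint and return its ID (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above works without boto3

    ec2 = boto3.client("ec2", region_name=region)
    response = ec2.create_vpc_endpoint(
        **s3_gateway_endpoint_request(vpc_id, region, route_table_ids)
    )
    return response["VpcEndpoint"]["VpcEndpointId"]
```

The request builder is kept separate so the parameters can be inspected before anything is created in the account.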


Answer 1:


I agree with JohnRotenstein that more information is needed to give an answer. I would suggest starting with a simple table and a few simple data points. Here is a step-by-step approach; working through it should help you resolve your issue.

Assume this is your table structure. I have used a variety of data types to prove my point.

create table sales(
    salesid integer,
    commission decimal(8,2),
    saledate date,
    description varchar(255),
    created_at timestamp default sysdate,
    updated_at timestamp
);

To keep it simple, here is the data file that resides in S3.
Contents of the file (sales-example.txt); the rows are pipe-delimited and the first line is a header:

salesid,commission,saledate,description,created_at,updated_at
1|3.55|2018-12-10|Test description|2018-05-17 23:54:51|2018-05-17 23:54:51
2|6.55|2018-01-01|Test description|2018-05-17 23:54:51|2018-05-17 23:54:51
4|7.55|2018-02-10|Test description|2018-05-17 23:54:51|2018-05-17 23:54:51
5|3.55||Test description|2018-05-17 23:54:51|2018-05-17 23:54:51
7|3.50|2018-10-10|Test description|2018-05-17 23:54:51|2018-05-17 23:54:51
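
Before uploading, it can help to check the file locally. The sketch below uses Python's csv module to approximate how the sample would be read with a pipe delimiter and one skipped header line; it is not Redshift's parser, and the function name is made up for illustration. The pipe is COPY's default delimiter when the CSV option is not used, which is why the comma-separated header line does no harm once it is skipped.

```python
import csv
import io

# The sample file from the answer, reproduced verbatim.
SAMPLE = "\n".join([
    "salesid,commission,saledate,description,created_at,updated_at",
    "1|3.55|2018-12-10|Test description|2018-05-17 23:54:51|2018-05-17 23:54:51",
    "2|6.55|2018-01-01|Test description|2018-05-17 23:54:51|2018-05-17 23:54:51",
    "4|7.55|2018-02-10|Test description|2018-05-17 23:54:51|2018-05-17 23:54:51",
    "5|3.55||Test description|2018-05-17 23:54:51|2018-05-17 23:54:51",
    "7|3.50|2018-10-10|Test description|2018-05-17 23:54:51|2018-05-17 23:54:51",
]) + "\n"


def parse_sample(text, delimiter="|", ignore_header=1):
    """Split each line on the delimiter, skip header lines, map empty fields to None."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    data_rows = list(reader)[ignore_header:]
    return [[field if field != "" else None for field in row] for row in data_rows]


rows = parse_sample(SAMPLE)

# Every data row should have one field per column in the COPY column list.
assert all(len(row) == 6 for row in rows)
```

The row with salesid 5 has an empty saledate; Redshift loads empty fields for non-character columns such as DATE as NULL, which the sketch mirrors with None.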

Run the following two commands using the psql terminal or any SQL client. Make sure to run the second command as well.

copy sales(salesid,commission,saledate,description,created_at,updated_at)
from 's3://example-bucket/foo/bar/sales-example.txt'
credentials 'aws_access_key_id=************;aws_secret_access_key=***********'
IGNOREHEADER 1;

commit;
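
The same two steps can be scripted. This is a minimal sketch, assuming a Python DB-API 2.0 connection with autocommit turned off (for example one obtained from psycopg2, which is an assumption and not something the thread mentions); run_copy is a hypothetical helper name. Without the explicit commit, the loaded rows would not be visible to other sessions.

```python
def run_copy(conn, copy_sql):
    """Execute a COPY statement on a DB-API connection and commit it.

    Rolls back and re-raises if the statement fails, so the session is not
    left inside an aborted transaction.
    """
    cur = conn.cursor()
    try:
        cur.execute(copy_sql)
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        cur.close()
```

Calling run_copy(conn, "copy sales(...) from 's3://example-bucket/foo/bar/sales-example.txt' ... IGNOREHEADER 1;") with the statement above performs the COPY and the commit in one step.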

I hope this helps you debug your issue.



Source: https://stackoverflow.com/questions/53403088/copying-data-from-s3-to-redshift-hangs
