We used to copy files from S3 to Redshift every day using the COPY command, from a bucket with no specific policy.
COPY schema.table_staging FROM 's3://our-bucket/X/YYYY/MM/DD/' CREDENTIALS 'aws_access_key_id=xxxxxx;aws_secret_access_key=xxxxxx' CSV GZIP DELIMITER AS '|' TIMEFORMAT 'YYYY-MM-DD HH24:MI:SS';
As we needed to improve the security of our S3 bucket, we added a policy that only authorizes connections from our VPC (the one we use for our Redshift cluster) or from a specific IP address.
{
  "Version": "2012-10-17",
  "Id": "S3PolicyId1",
  "Statement": [
    {
      "Sid": "DenyAllExcept",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::our-bucket/*",
        "arn:aws:s3:::our-bucket"
      ],
      "Condition": {
        "StringNotEqualsIfExists": {
          "aws:SourceVpc": "vpc-123456789"
        },
        "NotIpAddressIfExists": {
          "aws:SourceIp": [
            "12.35.56.78/32"
          ]
        }
      }
    }
  ]
}
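As a rough illustration of how the Condition block in this policy is evaluated, here is a minimal Python sketch. It is not AWS's actual evaluator. It only models two documented rules: the operators inside one Condition block are ANDed together, and an `...IfExists` operator evaluates to true when its key is missing from the request context. The helper names and the sample request contexts are hypothetical, and the sketch assumes that `aws:SourceVpc` is only present on requests that arrive through a VPC endpoint.

```python
import ipaddress

# Values copied from the bucket policy above.
ALLOWED_VPC = "vpc-123456789"
ALLOWED_CIDRS = ["12.35.56.78/32"]


def string_not_equals_if_exists(context, key, expected):
    """StringNotEqualsIfExists: true when the key is absent or differs."""
    if key not in context:
        return True
    return context[key] != expected


def not_ip_address_if_exists(context, key, cidrs):
    """NotIpAddressIfExists: true when the key is absent or outside all CIDRs."""
    if key not in context:
        return True
    address = ipaddress.ip_address(context[key])
    return not any(address in ipaddress.ip_network(cidr) for cidr in cidrs)


def is_denied(context):
    """The Deny statement applies only when BOTH condition operators are true."""
    return (
        string_not_equals_if_exists(context, "aws:SourceVpc", ALLOWED_VPC)
        and not_ip_address_if_exists(context, "aws:SourceIp", ALLOWED_CIDRS)
    )


# Hypothetical request contexts, for illustration only.
via_vpc_endpoint = {"aws:SourceVpc": "vpc-123456789"}
from_allowed_ip = {"aws:SourceIp": "12.35.56.78"}
from_other_public_ip = {"aws:SourceIp": "203.0.113.10"}

print(is_denied(via_vpc_endpoint))
print(is_denied(from_allowed_ip))
print(is_denied(from_other_public_ip))
```

Under these assumptions, a request is only let through if it carries the expected `aws:SourceVpc` value or comes from the listed address. A request that reaches S3 over a public path from any other address carries no `aws:SourceVpc` key at all, so both tests are true and the Deny applies.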
This policy works well for accessing files from EC2, EMR, or our specific IP address using the AWS CLI or the boto Python library.
Here is the error we now get on Redshift when running the COPY command:
Many thanks in advance if you can help us with this,
Damien
PS: this question is quite similar to this one: Copying data from S3 to Redshift - Access denied