Question
I have installed SparkR, along with R and RStudio, on EC2. I'm trying to read files located on S3:
temp <- textFile(sc, "s3://dev.xxxx.com/txttest")
and get:
java.lang.IllegalArgumentException: AWS Access Key ID and Secret Access Key must be
specified as the username or password (respectively) of a s3 URL, or by setting the
fs.s3.awsAccessKeyId or fs.s3.awsSecretAccessKey properties (respectively).
I've tried adding my access key and secret like so:
temp <- textFile(sc, "s3://{access_key}:{secret_key}@dev.xxxx.com/txttest")
and got:
Invalid hostname in URI s3://11111111111111111111:2222222222222222222222222222222222222222@dev.xxx.com
at org.apache.hadoop.fs.s3.S3Credentials.initialize(S3Credentials.java:41)
I also tried setting the environment variables
export AWS_SECRET_ACCESS_KEY=2222222222222222222222222222222222222222
export AWS_ACCESS_KEY_ID=11111111111111111111
before launching the cluster, but to no avail.
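(I launch R from RStudio Server, so one thing I have not ruled out is that the session simply never sees variables exported in my login shell. If that's the problem, I suppose the equivalent inside the session would be something like the following; Sys.setenv is standard R, but whether Hadoop's s3 scheme reads these variables at all is an assumption on my part.)

# Untested sketch: set the AWS credentials inside the R session itself,
# in case the RStudio session did not inherit the shell exports above.
Sys.setenv(
  AWS_ACCESS_KEY_ID     = "11111111111111111111",
  AWS_SECRET_ACCESS_KEY = "2222222222222222222222222222222222222222"
)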
Questions:
1. How can I change the fs.s3.awsAccessKeyId and fs.s3.awsSecretAccessKey properties?
2. Is there a correct syntax I'm missing in the URI?
Any help would be greatly appreciated.
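For reference, here is the kind of thing I was imagining for question 1. It is only an untested sketch: I'm assuming that sparkR.init() accepts a sparkEnvir list of Spark properties, and that Spark copies any spark.hadoop.* entry into the Hadoop configuration, which would set the two fs.s3 properties the first error asks for (the master, appName, and key values below are placeholders).

library(SparkR)

# Untested sketch: pass the credentials as Spark properties when creating
# the context; properties prefixed with "spark.hadoop." should end up in
# the Hadoop configuration as fs.s3.awsAccessKeyId / fs.s3.awsSecretAccessKey.
sc <- sparkR.init(
  master = "local[2]",
  appName = "s3test",
  sparkEnvir = list(
    "spark.hadoop.fs.s3.awsAccessKeyId"     = "11111111111111111111",
    "spark.hadoop.fs.s3.awsSecretAccessKey" = "2222222222222222222222222222222222222222"
  )
)
temp <- textFile(sc, "s3://dev.xxxx.com/txttest")

For question 2, my reading of the Hadoop docs is that the inline form is s3://ACCESS_KEY:SECRET_KEY@bucket/path, with any / in the secret key percent-encoded as %2F, but as the second error above shows, that has not worked for me either.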
Source: https://stackoverflow.com/questions/29898880/sparkr-on-rstudio-cannot-access-s3