Cannot access S3 bucket with Hadoop

Submitted anonymously (unverified) on 2019-12-03 10:24:21

Question:

I'm trying to access my S3 bucket with Hadoop (2.7.3) and I'm getting the following error:

ubuntu@AWS:~/Prototype/hadoop$ bin/hadoop fs -ls s3://[bucket]/

17/03/24 15:33:31 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
-ls: Fatal internal error
com.amazonaws.services.s3.model.AmazonS3Exception: Status Code: 400, AWS Service: Amazon S3, AWS Request ID: 1FA2318A386330C0, AWS Error Code: null, AWS Error Message: Bad Request, S3 Extended Request ID: 1S7Eq6s9YxUb9bPwyHP73clJvD619LZ2o0jE8VklMAA9jrKXPbvT7CG6nh0zeuluGrzybiPbgRQ=
    at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:798)
    at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:421)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:232)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3528)
    at com.amazonaws.services.s3.AmazonS3Client.headBucket(AmazonS3Client.java:1031)
    at com.amazonaws.services.s3.AmazonS3Client.doesBucketExist(AmazonS3Client.java:994)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:297)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2669)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
    at org.apache.hadoop.fs.shell.PathData.expandAsGlob(PathData.java:325)
    at org.apache.hadoop.fs.shell.Command.expandArgument(Command.java:235)
    at org.apache.hadoop.fs.shell.Command.expandArguments(Command.java:218)
    at org.apache.hadoop.fs.shell.Command.processRawArguments(Command.java:201)
    at org.apache.hadoop.fs.shell.Command.run(Command.java:165)
    at org.apache.hadoop.fs.FsShell.run(FsShell.java:287)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
    at org.apache.hadoop.fs.FsShell.main(FsShell.java:340)
ubuntu@AWS:~/Prototype/hadoop$

conf-site.xml:

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>s3://[ Bucket ]</value>
    </property>

    <property>
        <name>fs.s3a.endpoint</name>
        <value>s3.eu-central-1.amazonaws.com</value>
    </property>

    <property>
        <name>fs.s3a.access.key</name>
        <value>[ Access Key Id ]</value>
    </property>

    <property>
        <name>fs.s3a.secret.key</name>
        <value>[ Secret Access Key ]</value>
    </property>

    <property>
        <name>fs.s3.awsAccessKeyId</name>
        <value>[ Access Key Id ]</value>
    </property>

    <property>
        <name>fs.s3.awsSecretAccessKey</name>
        <value>[ Secret Access Key ]</value>
    </property>

    <property>
        <name>fs.s3n.awsAccessKeyId</name>
        <value>[ Access Key Id ]</value>
    </property>

    <property>
        <name>fs.s3n.awsSecretAccessKey</name>
        <value>[ Secret Access Key ]</value>
    </property>

    <property>
        <name>fs.s3.impl</name>
        <value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
    </property>

    <!-- Comma separated list of local directories used to buffer
         large results prior to transmitting them to S3. -->
    <property>
        <name>fs.s3.buffer.dir</name>
        <value>/tmp</value>
    </property>
</configuration>

Does anyone know what the issue is?

Edit: The bucket and the VMs accessing it are both in Frankfurt (eu-central-1). The problem looked similar to the one described at https://docs.hortonworks.com/HDPDocuments/HDCloudAWS/HDCloudAWS-1.8.0/bk_hdcloud-aws/content/s3-trouble/index.html, but even after adding the endpoint it still doesn't work.

Answer 1:

This sounds like the V4 authentication problem, which the fs.s3a.endpoint property should have fixed.
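For what it's worth, a minimal sketch of the commonly cited V4 workaround on Hadoop 2.7.x. The com.amazonaws.services.s3.enableV4 system property is real AWS Java SDK configuration, but whether it takes effect depends on the SDK version bundled with your Hadoop build (newer Hadoop versions expose fs.s3a.signing-algorithm instead):

# Frankfurt (eu-central-1) only accepts V4-signed requests. The AWS Java SDK
# reads this system property to force SigV4; whether the SDK bundled with
# Hadoop 2.7.3 honours it is not guaranteed.
export HADOOP_OPTS="$HADOOP_OPTS -Dcom.amazonaws.services.s3.enableV4=true"

# fs.s3.impl maps s3:// to S3AFileSystem in the config above, so the
# fs.s3a.endpoint setting applies to this command.
bin/hadoop fs -ls s3://[bucket]/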

Clock problems can cause this too. Check Joda-Time, and make sure all your machines have caught up with this weekend's clock change.
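As a quick sanity check (plain shell, nothing Hadoop-specific): AWS rejects signed requests when the client clock is more than about 15 minutes off, so you can compare the local clock against the Date header the S3 endpoint returns:

# Local clock in UTC
date -u

# The endpoint's view of the time, from the Date response header
curl -sI https://s3.eu-central-1.amazonaws.com | grep -i '^date'

# If they disagree, resync (tooling varies by distro)
sudo ntpdate pool.ntp.org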

Also try grabbing the Hadoop 2.8.0 RC3 and see whether the problem goes away there. If it is still present, that is the version to ask for help with on the Apache lists.
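If you do test a newer build, one way to point it at the bucket without editing its config files is to pass the relevant properties inline; FsShell runs through ToolRunner (visible in the stack trace above), so the generic -D options are accepted. The bracketed values are placeholders:

# One-off test from a freshly unpacked build, overriding the endpoint on the
# command line; use the s3a:// scheme so the fs.s3a.* settings apply.
bin/hadoop fs \
  -D fs.s3a.endpoint=s3.eu-central-1.amazonaws.com \
  -D fs.s3a.access.key=[AccessKeyId] \
  -D fs.s3a.secret.key=[SecretAccessKey] \
  -ls s3a://[bucket]/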


