EntityTooLarge error when uploading a 5G file to Amazon S3

Asked by 难免孤独 on 2021-01-02 04:17

The Amazon S3 file size limit is supposed to be 5 TB according to this announcement, but I am getting the following error when uploading a 5 GB file:

'/mahler%2Fparq


        
3 Answers
谎友^ answered on 2021-01-02 05:11

    The trick usually seems to be figuring out how to tell S3 to do a multipart upload: a single PUT is capped at 5 GB, even though an object itself can be as large as 5 TB. For copying data from HDFS to S3, this can be done by using the s3n filesystem and explicitly enabling multipart uploads with fs.s3n.multipart.uploads.enabled=true.

    This can be done like:

    hdfs dfs -Dfs.s3n.awsAccessKeyId=ACCESS_KEY -Dfs.s3n.awsSecretAccessKey=SUPER_SECRET_KEY -Dfs.s3n.multipart.uploads.enabled=true -cp hdfs:///path/to/source/data s3n://bucket/folder/
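
    If you don't want to pass the keys and the multipart flag on every invocation, the same settings can go into the cluster's core-site.xml instead. This is only a sketch: the first three property names are the ones used in the command above, and the optional part-size property (fs.s3n.multipart.uploads.block.size) is taken from the hadoop-aws documentation linked below, so check it against your Hadoop version.

    <!-- core-site.xml: equivalent of the -D flags above -->
    <configuration>
      <property>
        <name>fs.s3n.awsAccessKeyId</name>
        <value>ACCESS_KEY</value>
      </property>
      <property>
        <name>fs.s3n.awsSecretAccessKey</name>
        <value>SUPER_SECRET_KEY</value>
      </property>
      <property>
        <!-- enable multipart uploads so objects over the single-PUT limit go through -->
        <name>fs.s3n.multipart.uploads.enabled</name>
        <value>true</value>
      </property>
      <property>
        <!-- optional: size of each uploaded part in bytes (64 MB here) -->
        <name>fs.s3n.multipart.uploads.block.size</name>
        <value>67108864</value>
      </property>
    </configuration>

    With that in place, a plain hdfs dfs -cp hdfs:///path/to/source/data s3n://bucket/folder/ should work without the -D options.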
    

    And further configuration can be found here: https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html
