Best way to move files between S3 buckets?

暗喜 2020-12-07 11:09

I'd like to copy some files from a production bucket to a development bucket daily.

For example: Copy productionbucket/feed/feedname/date to developmentbucket/feed/

12 Answers
  • 2020-12-07 11:11

    To move or copy files from one bucket to another (or within the same bucket), I use the s3cmd tool, which works fine. For instance:

    s3cmd cp --recursive s3://bucket1/directory1 s3://bucket2/directory1
    s3cmd mv --recursive s3://bucket1/directory1 s3://bucket2/directory1
    
  • 2020-12-07 11:12

    If you have a Unix host within AWS, use s3cmd from s3tools.org. Set up permissions so that your key has read access to the production bucket and write access to the development bucket. Then run:

    s3cmd cp -r s3://productionbucket/feed/feedname/date s3://developmentbucket/feed/feedname
    
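
For the daily copy in the question, the s3cmd command in the answer above could be wrapped in a small script that fills in the date. This is only a sketch: the thread does not say how the `date` path component is formatted, so YYYY-MM-DD is an assumption, and the helper names are made up for the example.

```shell
#!/bin/sh
# Sketch only: parametrize the s3cmd command from the answer above.
# Assumption (not from the thread): the "date" path component is YYYY-MM-DD.

# Print the production prefix for one feed and one day.
feed_source_uri() {
  printf 's3://productionbucket/feed/%s/%s\n' "$1" "$2"
}

# Print the development prefix for one feed.
feed_dest_uri() {
  printf 's3://developmentbucket/feed/%s\n' "$1"
}

# Copy one day's files; the day defaults to today's UTC date.
copy_feed_for_day() {
  feed="$1"
  day="${2:-$(date -u +%Y-%m-%d)}"
  s3cmd cp -r "$(feed_source_uri "$feed" "$day")" "$(feed_dest_uri "$feed")"
}
```

Calling `copy_feed_for_day feedname` from a daily cron job would then copy the current day's prefix. It is worth checking on a small test prefix first how the keys land in the destination.
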
  • 2020-12-07 11:13

    We had this exact problem with our ETL jobs at Snowplow, so we extracted our parallel file-copy code (Ruby, built on top of Fog), into its own Ruby gem, called Sluice:

    https://github.com/snowplow/sluice

    Sluice also handles S3 file delete, move and download; all parallelised and with automatic re-try if an operation fails (which it does surprisingly often). I hope it's useful!

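
The retry idea mentioned in the answer above is not tied to Ruby. As a rough illustration (this is not Sluice's code; the function name, attempt count, and delay are invented for the example), a shell wrapper can re-run a failing command a few times before giving up:

```shell
#!/bin/sh
# Sketch only: re-run a command until it succeeds or attempts run out.
# Usage: retry MAX_ATTEMPTS DELAY_SECONDS command [args...]
retry() {
  retry_max="$1"
  retry_delay="$2"
  shift 2
  retry_attempt=1
  while :; do
    "$@" && return 0                      # command succeeded, stop retrying
    if [ "$retry_attempt" -ge "$retry_max" ]; then
      echo "giving up after $retry_attempt attempts: $*" >&2
      return 1
    fi
    retry_attempt=$((retry_attempt + 1))
    sleep "$retry_delay"                  # fixed pause between attempts
  done
}
```

For example, `retry 3 5 aws s3 sync s3://oldbucket s3://newbucket` would try the sync up to three times, five seconds apart.
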
  • 2020-12-07 11:14

    For me the following command just worked:

    aws s3 mv s3://bucket/data s3://bucket/old_data --recursive
    
  • 2020-12-07 11:18

    The new official AWS CLI natively supports most of the functionality of s3cmd. I'd previously been using s3cmd or the Ruby AWS SDK for tasks like this, but the official CLI works great.

    http://docs.aws.amazon.com/cli/latest/reference/s3/sync.html

    aws s3 sync s3://oldbucket s3://newbucket
    
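
To sync only the feed and date from the question rather than a whole bucket, the source and destination can be narrowed to prefixes, and `--dryrun` previews the operations without performing them. A sketch, assuming a YYYY-MM-DD date component and a development bucket that mirrors the production key layout (both are assumptions, as are the helper names and the `DRYRUN` variable):

```shell
#!/bin/sh
# Sketch only: sync one feed/day prefix instead of the whole bucket.
# Assumptions: date component is YYYY-MM-DD; the development bucket
# mirrors the production key layout.

# Print the key prefix shared by both buckets for one feed and day.
feed_prefix() {
  printf 'feed/%s/%s/\n' "$1" "$2"
}

# Sync one day's prefix. Set DRYRUN=1 to preview without copying.
sync_feed_for_day() {
  prefix="$(feed_prefix "$1" "${2:-$(date -u +%Y-%m-%d)}")"
  if [ "${DRYRUN:-0}" = "1" ]; then
    aws s3 sync "s3://productionbucket/$prefix" "s3://developmentbucket/$prefix" --dryrun
  else
    aws s3 sync "s3://productionbucket/$prefix" "s3://developmentbucket/$prefix"
  fi
}
```

Running `DRYRUN=1 sync_feed_for_day feedname` first lists what would be copied; dropping `DRYRUN=1` performs the sync.
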
  • 2020-12-07 11:18

    Here is a Ruby class for performing this: https://gist.github.com/4080793

    Example usage:

    $ gem install aws-sdk
    $ irb -r ./bucket_sync_service.rb
    > from_creds = {aws_access_key_id:"XXX",
                    aws_secret_access_key:"YYY",
                    bucket:"first-bucket"}
    > to_creds = {aws_access_key_id:"ZZZ",
                  aws_secret_access_key:"AAA",
                  bucket:"second-bucket"}
    > syncer = BucketSyncService.new(from_creds, to_creds)
    > syncer.debug = true # log each object
    > syncer.perform
    