Throttling S3 commands with aws cli


Question


I'm running a backup script using AWS CLI to perform an S3 sync command every night on my MediaTemple server. This has run without fail for months, but I updated my Plesk installation and now every night, when the backup script runs, MediaTemple disables my server due to excessive usage. The limits I seem to be crossing are as follows:

RESOURCE INFO:
Packets per second limit: 35000
Packets per second detected: 42229.11667000000306870788
Bytes per second limit: 50000000
Bytes per second detected: 61801446.10000000149011611938

They also provide a networking snapshot from the time they took the server offline, which shows many open connections to Amazon IP addresses (9 at the time of the snapshot).

Is there anything I can do to throttle the connections to AWS? Preferably I'm looking for an option within the AWS API (though I haven't seen anything useful in the documentation), but barring that, is there something I can do on my end to manage the connections at the network level?


Answer 1:


As well as changing the maximum concurrent connections and chunk size already mentioned, you can also set max_bandwidth. This is very effective when uploading large single files.

aws configure set default.s3.max_bandwidth 50MB/s




Answer 2:


The AWS CLI S3 transfer commands (which include sync) have the following relevant configuration options:

  • max_concurrent_requests -
    • Default: 10
    • The maximum number of concurrent requests.
  • multipart_threshold -
    • Default: 8MB
    • The size threshold the CLI uses for multipart transfers of individual files.
  • multipart_chunksize -
    • Default: 8MB
    • When using multipart transfers, this is the chunk size that the CLI uses for multipart transfers of individual files.

This isn't as granular as throttling packets per second, but setting a lower concurrent-request value and lowering both the multipart threshold and chunk size should help. If the values you pasted are close to your average, I would start with these values and tweak until you're reliably staying under the limits:

$ aws configure set default.s3.max_concurrent_requests 8
$ aws configure set default.s3.multipart_threshold 6MB
$ aws configure set default.s3.multipart_chunksize 6MB
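The same values can also be written directly into the s3 section of ~/.aws/config instead of running the configure commands; a sketch of the equivalent file entries, assuming the default profile:

[default]
s3 =
  max_concurrent_requests = 8
  multipart_threshold = 6MB
  multipart_chunksize = 6MB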



Answer 3:


I ended up using Trickle and capping download & upload speeds at 20,000 kb/s. This let me use my existing script without much modification (all I had to do was add the trickle call to the beginning of the command).
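A minimal sketch of what that wrapped command might look like (the local path and bucket name are placeholders; trickle's -d/-u rates are in KB/s, and -s runs it in standalone mode):

trickle -s -d 20000 -u 20000 aws s3 sync /path/to/backup s3://bucket_name/backup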

Also, it looks like bandwidth throttling has been added as an issue to AWS CLI, so hopefully this will all be a non-issue for folks if that gets implemented.




Answer 4:


If, like me, you cannot get trickle to work with the aws s3 command, you can use:

sudo apt-get install pv   # or: sudo yum install pv
pv -L 1M local_filename 2>/dev/null | aws s3 cp - s3://bucket_name/remote_filename

where -L 1M limits the bandwidth to 1 MB/s and the dash after cp indicates stdin.
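Since aws s3 sync can't read from stdin, one way to apply this per file is a small loop; a rough sketch (the directory and bucket names are placeholders, and unlike sync it re-uploads everything):

# upload each file in the backup directory through pv's rate limit
for f in /path/to/backup/*; do
  pv -L 1M "$f" 2>/dev/null | aws s3 cp - "s3://bucket_name/$(basename "$f")"
done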

Note: the awscli from apt-get is too old to support stdin input; you need to upgrade it via pip.
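For example (assuming pip is available on the server):

pip install --upgrade awscli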




Answer 5:


I could not get trickle to work with aws-cli, but came across s3cmd which works great for me. It has an option to rate limit. It can be found in the Fedora repos, and I imagine other distros have it packaged too.

s3cmd --progress --stats --limit-rate=50k sync ./my_photos/ s3://mybucket
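If s3cmd has not been set up yet, its credentials are configured interactively first, e.g.:

s3cmd --configure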

s3cmd man page



Source: https://stackoverflow.com/questions/30620402/throttling-s3-commands-with-aws-cli
