Throttling S3 commands with aws cli


Question


I'm running a backup script using AWS CLI to perform an S3 sync command every night on my MediaTemple server. This has run without fail for months, but I updated my Plesk installation and now every night, when the backup script runs, MediaTemple disables my server due to excessive usage. The limits I seem to be crossing are as follows:

RESOURCE INFO:
Packets per second limit: 35000
Packets per second detected: 42229.11667000000306870788
Bytes per second limit: 50000000
Bytes per second detected: 61801446.10000000149011611938

They also provide a networking snapshot from the time they took the server offline, which shows many open connections to Amazon IP addresses (9 at the time of the snapshot).

Is there anything I can do to throttle the connections to AWS? Preferably I'm looking for an option within the AWS API (though I haven't seen anything useful in the documentation), but barring that, is there something I can do on my end to manage the connections at the network level?


Answer 1:


As well as changing the maximum concurrent connections and chunk size already mentioned, you can also set max_bandwidth. This is very effective when uploading large single files.

aws configure set default.s3.max_bandwidth 50MB/s




Answer 2:


The AWS CLI S3 transfer commands (which include sync) have the following relevant configuration options:

  • max_concurrent_requests -
    • Default: 10
    • The maximum number of concurrent requests.
  • multipart_threshold -
    • Default: 8MB
    • The size threshold the CLI uses for multipart transfers of individual files.
  • multipart_chunksize -
    • Default: 8MB
    • When using multipart transfers, this is the chunk size that the CLI uses for multipart transfers of individual files.

This isn't as granular as throttling packets per second, but setting a lower concurrent-request value and lowering both the multipart threshold and chunk size should help. If the values you pasted are close to your average, I would start with these values and tweak until you're reliably staying under the limits:

$ aws configure set default.s3.max_concurrent_requests 8
$ aws configure set default.s3.multipart_threshold 6MB
$ aws configure set default.s3.multipart_chunksize 6MB
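The same values can also be written directly into the s3 section of ~/.aws/config instead of running the configure commands; a sketch of the equivalent file entries, assuming the default profile:

[default]
s3 =
  max_concurrent_requests = 8
  multipart_threshold = 6MB
  multipart_chunksize = 6MB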



Answer 3:


I ended up using Trickle and capping download & upload speeds at 20,000 kb/s. This let me use my existing script without much modification (all I had to do was add the trickle call to the beginning of the command).
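A minimal sketch of what that wrapped command might look like (the local path and bucket name are placeholders; trickle's -d/-u rates are in KB/s, and -s runs it in standalone mode):

trickle -s -d 20000 -u 20000 aws s3 sync /path/to/backup s3://bucket_name/backup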

Also, it looks like bandwidth throttling has been added as an issue to AWS CLI, so hopefully this will all be a non-issue for folks if that gets implemented.




Answer 4:


If, like me, you cannot get trickle to work with the aws s3 command, you can use:

sudo apt-get install pv   # or: sudo yum install pv
pv -L 1M local_filename 2>/dev/null | aws s3 cp - s3://bucket_name/remote_filename

where -L 1M limits the bandwidth to 1 MB/s and the dash after cp indicates stdin.
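Since aws s3 sync can't read from stdin, one way to apply this per file is a small loop; a rough sketch (the directory and bucket names are placeholders, and unlike sync it re-uploads everything):

# upload each file in the backup directory through pv's rate limit
for f in /path/to/backup/*; do
  pv -L 1M "$f" 2>/dev/null | aws s3 cp - "s3://bucket_name/$(basename "$f")"
done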

Note: the awscli from apt-get is too old to support stdin input; you need to upgrade it via pip.
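For example (assuming pip is available on the server):

pip install --upgrade awscli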




Answer 5:


I could not get trickle to work with aws-cli, but came across s3cmd which works great for me. It has an option to rate limit. It can be found in the Fedora repos, and I imagine other distros have it packaged too.

s3cmd --progress --stats --limit-rate=50k sync ./my_photos/ s3://mybucket
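If s3cmd has not been set up yet, its credentials are configured interactively first, e.g.:

s3cmd --configure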

s3cmd man page



Source: https://stackoverflow.com/questions/30620402/throttling-s3-commands-with-aws-cli
