How to get size of all files in an S3 bucket with versioning?

筅森魡賤 submitted on 2020-12-03 07:45:16

Question


I know this command can provide the size of all files in a bucket:

aws s3 ls mybucket --recursive --summarize --human-readable

But this does not account for versioning.

If I run this command:

aws s3 ls s3://mybucket/myfile --human-readable

It shows something like "100 MiB", but the file may have 10 versions, which adds up to more like "1 GiB" in total.

The closest I have come is getting the sizes of every version of a given file:

aws s3api list-object-versions --bucket mybucket --prefix "myfile" --query 'Versions[?StorageClass=`STANDARD`].Size' > /tmp/s3_myfile_version_sizes

Then take the sum of all version sizes.

But I would have to rerun this command for every file in a bucket.

Is there an easier way to do this?


Answer 1:


You can run list-object-versions on the bucket as a whole:

aws s3api list-object-versions --bucket my-bucket --query 'Versions[*].Size'

Use jq to sum it up:

aws s3api list-object-versions --bucket my-bucket --query 'Versions[*].Size' | jq add

Or, if you need a human readable output:

aws s3api list-object-versions --bucket my-bucket --query 'Versions[*].Size' | jq add | numfmt --to=iec-i --suffix=B

You can also add a prefix to get the size of a given "folder", and print the number of versions along with the total:

aws s3api list-object-versions --bucket my-bucket --prefix my-folder --query 'Versions[*].Size' | jq 'length,add'

Or you can use jq filtering to write more complex filters, for example, including only non-current objects:

aws s3api list-object-versions --bucket my-bucket --prefix my-folder | jq '[.Versions[]|select(.IsLatest == false)|.Size] | length,add'

If jq is not available, you can use the --output text option, but a plain list of sizes then comes out as tab-separated values on one line. Selecting [Size,Size] is a hack that forces one line per version, so awk can add up the first column:

aws s3api list-object-versions --bucket my-bucket --query 'Versions[*].[Size,Size]' --output text | awk '{s+=$1} END {printf "%.0f\n", s}'
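
A variation on the same idea, as a sketch: instead of forcing one value per line, let awk add up every field on every line, so the plain 'Versions[*].Size' text output works as well. The helper name sum_fields is made up for this example, and it only assumes whitespace-separated numbers on stdin.

```shell
# sum_fields: add up every whitespace-separated number read from stdin.
# (Hypothetical helper name; not part of the AWS CLI.)
sum_fields() {
  awk '{ for (i = 1; i <= NF; i++) s += $i } END { printf "%.0f\n", s }'
}

# Intended usage (needs AWS credentials, so it is left commented out):
# aws s3api list-object-versions --bucket my-bucket \
#   --query 'Versions[*].Size' --output text | sum_fields
```

Because it loops over all fields, it does not matter whether the sizes arrive on one line or spread over several.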

If you have a large number of objects, it might be better to use data provided by the Amazon S3 Storage Inventory:

Amazon S3 inventory provides a comma-separated values (CSV) flat-file output of your objects and their corresponding metadata on a daily or weekly basis for an S3 bucket or a shared prefix (that is, objects that have names that begin with a common string).
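
Once a report exists, totalling it could look roughly like the sketch below. Several things here are assumptions rather than facts about your setup: the inventory is configured for CSV output, includes all versions and the size field, and the size sits in the column you pass in (5 is only an example; the real column order is listed in the report's manifest). The helper name sum_inventory_sizes is made up, and the comma split is naive, so it assumes no field contains a literal comma.

```shell
# sum_inventory_sizes COLUMN: total one column of inventory CSV rows read from stdin.
# (Hypothetical helper; COLUMN is the 1-based position of the size field.)
sum_inventory_sizes() {
  awk -F',' -v col="$1" '
    { v = $col; gsub(/"/, "", v); s += v }   # strip the surrounding quotes, then add
    END { printf "%.0f\n", s }
  '
}

# Intended usage, assuming a gzipped CSV data file from the report
# (the file name is a placeholder):
# gzip -dc inventory-data-file.csv.gz | sum_inventory_sizes 5
```

Rows with an empty size field, such as delete markers, simply count as 0 here.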




Answer 2:


Use CloudWatch. The bucket's BucketSizeBytes storage metric includes all versions, not only the current ones.
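
From the CLI, that could be queried roughly as follows. This is a sketch under assumptions: bucket_size_bytes is a made-up wrapper name, the caller supplies the time window, and StandardStorage only covers the STANDARD storage class (other classes are reported under their own StorageType values). S3 publishes this storage metric about once a day, so the window should span at least a day or two.

```shell
# bucket_size_bytes BUCKET START END: print the latest daily BucketSizeBytes value.
# (Hypothetical wrapper; START and END are ISO 8601 timestamps.)
bucket_size_bytes() {
  aws cloudwatch get-metric-statistics \
    --namespace AWS/S3 \
    --metric-name BucketSizeBytes \
    --dimensions Name=BucketName,Value="$1" Name=StorageType,Value=StandardStorage \
    --start-time "$2" --end-time "$3" \
    --period 86400 --statistics Average \
    --query 'sort_by(Datapoints, &Timestamp)[-1].Average' \
    --output text
}

# Intended usage (needs AWS credentials, so it is left commented out):
# bucket_size_bytes my-bucket 2020-12-01T00:00:00Z 2020-12-03T00:00:00Z
```

The result is in bytes; piping it through numfmt --to=iec-i --suffix=B gives a human-readable figure.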



Source: https://stackoverflow.com/questions/43150572/how-to-get-size-of-all-files-in-an-s3-bucket-with-versioning
