I have more than 500,000 objects on s3. I am trying get the size of each object. I am using the following python code for that
Use the ContinuationToken returned in the response as a parameter for subsequent calls, until the IsTruncated value returned in the response is false.
This can be factored into a neat generator function:
def get_all_s3_objects(s3, **base_kwargs):
continuation_token = None
while True:
list_kwargs = dict(MaxKeys=1000, **base_kwargs)
if continuation_token:
list_kwargs['ContinuationToken'] = continuation_token
response = s3.list_objects_v2(**list_kwargs)
yield from response.get('Contents', [])
if not response.get('IsTruncated'): # At the end of the list?
break
continuation_token = response.get('NextContinuationToken')
for file in get_all_s3_objects(boto3.client('s3'), Bucket=bucket, Prefix=prefix):
print(file['size'])