How to get more than 1000 objects from S3 by using list_objects_v2?

挽巷 2021-01-01 09:55

I have more than 500,000 objects on S3. I am trying to get the size of each object. I am using the following Python code for that

3 answers
  •  心在旅途
    2021-01-01 10:27

    Pass the NextContinuationToken returned in each response as the ContinuationToken parameter of the next call, until the IsTruncated value returned in the response is false.

    This can be factored into a neat generator function:

    import boto3

    def get_all_s3_objects(s3, **base_kwargs):
        """Yield every object entry from list_objects_v2, following pagination."""
        continuation_token = None
        while True:
            # list_objects_v2 returns at most 1000 keys per call
            list_kwargs = dict(MaxKeys=1000, **base_kwargs)
            if continuation_token:
                list_kwargs['ContinuationToken'] = continuation_token
            response = s3.list_objects_v2(**list_kwargs)
            yield from response.get('Contents', [])
            if not response.get('IsTruncated'):  # At the end of the list?
                break
            continuation_token = response.get('NextContinuationToken')
    
    # bucket and prefix are assumed to be defined by the caller
    for file in get_all_s3_objects(boto3.client('s3'), Bucket=bucket, Prefix=prefix):
        print(file['Size'])  # the response key is 'Size' (capitalized), in bytes
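
As an alternative to tracking the token by hand, boto3 clients expose a built-in paginator for list_objects_v2 that follows the continuation token internally. The sketch below is not part of the original answer; the function name iter_object_sizes is made up for illustration, and s3 is assumed to be a boto3 S3 client (or any object exposing the same get_paginator method).

```python
def iter_object_sizes(s3, **list_kwargs):
    """Yield (key, size_in_bytes) for every object matched by list_kwargs.

    `s3` is assumed to be a boto3 S3 client; `list_kwargs` are passed
    through to list_objects_v2 (for example Bucket=..., Prefix=...).
    """
    # The paginator issues repeated list_objects_v2 calls and handles
    # ContinuationToken / IsTruncated on our behalf.
    paginator = s3.get_paginator('list_objects_v2')
    for page in paginator.paginate(**list_kwargs):
        # 'Contents' is absent from a page when no keys match.
        for obj in page.get('Contents', []):
            yield obj['Key'], obj['Size']
```

With a real client this would be called as iter_object_sizes(boto3.client('s3'), Bucket=bucket, Prefix=prefix). Because it is a generator, the 500,000 entries are streamed page by page rather than held in memory at once.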
    
