How to download the latest file of an S3 bucket using Boto3?

借酒劲吻你 2021-01-11 23:22

The other questions I could find were referring to an older version of Boto. I would like to download the latest file of an S3 bucket. In the documentation I found that there

7 answers
  •  盖世英雄少女心
    2021-01-11 23:54

    This handles buckets with more than 1000 objects (a single list_objects_v2 call returns at most 1000 keys, so the results have to be paginated). It is basically @SaadK's answer without the for loop, using the newer list_objects_v2 call.

    EDIT: Fixes the issue @Timothée-Jeannin identified. It now ensures that the latest object across all pages is identified.

    https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html#S3.Paginator.ListObjectsV2

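    import boto3

    def get_most_recent_s3_object(bucket_name, prefix):
        """Return the metadata dict of the newest object under prefix, or None if nothing matches."""
        s3 = boto3.client('s3')
        # The paginator follows continuation tokens, so every page of results is visited.
        paginator = s3.get_paginator('list_objects_v2')
        page_iterator = paginator.paginate(Bucket=bucket_name, Prefix=prefix)
        latest = None
        for page in page_iterator:
            # 'Contents' is missing from a page that has no matching objects.
            if 'Contents' in page:
                page_latest = max(page['Contents'], key=lambda x: x['LastModified'])
                if latest is None or page_latest['LastModified'] > latest['LastModified']:
                    latest = page_latest
        return latest

    # bucket_name and prefix are placeholders; set them to your own values first.
    latest = get_most_recent_s3_object(bucket_name, prefix)

    # latest is None when no object matches the prefix, so check before indexing.
    latest['Key']  # -->   'prefix/objectname'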
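
The function above only returns the object's metadata, while the question asks for the download itself. A minimal sketch of that last step, assuming the same paginated search and boto3's `download_file(Bucket, Key, Filename)` client method, might look like this. The function name `download_latest_s3_object`, the `dest_dir` and `s3_client` parameters, and the choice to skip keys ending in "/" (folder placeholders) are assumptions made for illustration, not part of the original answer.

```python
import os


def download_latest_s3_object(bucket_name, prefix, dest_dir=".", s3_client=None):
    """Download the most recently modified object under prefix.

    Returns the local file path, or None when no object matches.
    """
    if s3_client is None:
        # Imported lazily so a stub client can be passed in for testing.
        import boto3
        s3_client = boto3.client("s3")

    paginator = s3_client.get_paginator("list_objects_v2")
    latest = None
    for page in paginator.paginate(Bucket=bucket_name, Prefix=prefix):
        # Pages without matches carry no "Contents" entry.
        for obj in page.get("Contents", []):
            # Assumption: keys ending in "/" are folder placeholders, not files.
            if obj["Key"].endswith("/"):
                continue
            if latest is None or obj["LastModified"] > latest["LastModified"]:
                latest = obj

    if latest is None:
        return None

    # Save under the object's base name inside dest_dir.
    local_path = os.path.join(dest_dir, os.path.basename(latest["Key"]))
    s3_client.download_file(bucket_name, latest["Key"], local_path)
    return local_path
```

Calling `download_latest_s3_object(bucket_name, prefix)` with your own bucket and prefix would save the newest object into the current directory. Accepting the client as a parameter is a design choice that lets the search logic be checked without AWS credentials.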
    
