check if a key exists in a bucket in s3 using boto3

前端未结

关注

 24  2805

花落未央

I would like to know if a key exists in boto3. I can loop the bucket contents and check the key if it matches.

But that seems longer and an overkill. Boto3 official

相关标签:

24条回答

遇见更好的自我

2020-11-28 19:44

From https://www.peterbe.com/plog/fastest-way-to-find-out-if-a-file-exists-in-s3 this is pointed out to be the fastest method:

import boto3

boto3_session = boto3.session.Session()
s3_session_client = boto3_session.client("s3")
response = s3_session_client.list_objects_v2(
    Bucket=bc_df_caches_bucket, Prefix=s3_key
)
for obj in response.get("Contents", []):
    if obj["Key"] == s3_key:
        return True
return False

0 讨论(0)

忘掉有多难

2020-11-28 19:45

FWIW, here are the very simple functions that I am using

import boto3

def get_resource(config: dict={}):
    """Loads the s3 resource.

    Expects AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY to be in the environment
    or in a config dictionary.
    Looks in the environment first."""

    s3 = boto3.resource('s3',
                        aws_access_key_id=os.environ.get(
                            "AWS_ACCESS_KEY_ID", config.get("AWS_ACCESS_KEY_ID")),
                        aws_secret_access_key=os.environ.get("AWS_SECRET_ACCESS_KEY", config.get("AWS_SECRET_ACCESS_KEY")))
    return s3


def get_bucket(s3, s3_uri: str):
    """Get the bucket from the resource.
    A thin wrapper, use with caution.

    Example usage:

    >> bucket = get_bucket(get_resource(), s3_uri_prod)"""
    return s3.Bucket(s3_uri)


def isfile_s3(bucket, key: str) -> bool:
    """Returns T/F whether the file exists."""
    objs = list(bucket.objects.filter(Prefix=key))
    return len(objs) == 1 and objs[0].key == key


def isdir_s3(bucket, key: str) -> bool:
    """Returns T/F whether the directory exists."""
    objs = list(bucket.objects.filter(Prefix=key))
    return len(objs) > 1

0 讨论(0)

再見小時候

2020-11-28 19:47

There is one simple way by which we can check if file exists or not in S3 bucket. We donot need to use exception for this

sesssion = boto3.Session(aws_access_key_id, aws_secret_access_key)
s3 = session.client('s3')

object_name = 'filename'
bucket = 'bucketname'
obj_status = s3.list_objects(Bucket = bucket, Prefix = object_name)
if obj_status.get('Contents'):
    print("File exists")
else:
    print("File does not exists")

0 讨论(0)

别跟我提以往

2020-11-28 19:47
Use this concise oneliner, makes it less intrusive when you have to throw it inside an existing project without modifying much of the code.
```
s3_file_exists = lambda filename: bool(list(bucket.objects.filter(Prefix=filename)))
```
The above function assumes the bucket variable was already declared.

You can extend the lambda to support additional parameter like
```
s3_file_exists = lambda filename, bucket: bool(list(bucket.objects.filter(Prefix=filename)))
```
0 讨论(0)
发布评论:

提交评论
- 加载中...
误落风尘

2020-11-28 19:48
If you have less than 1000 in a directory or bucket you can get set of them and after check if such key in this set:
```
files_in_dir = {d['Key'].split('/')[-1] for d in s3_client.list_objects_v2(
Bucket='mybucket',
Prefix='my/dir').get('Contents') or []}
```
Such code works even if my/dir is not exists.

http://boto3.readthedocs.io/en/latest/reference/services/s3.html#S3.Client.list_objects_v2
0 讨论(0)
发布评论:

提交评论
- 加载中...

-上瘾入骨i

2020-11-28 19:50

I noticed that just for catching the exception using botocore.exceptions.ClientError we need to install botocore. botocore takes up 36M of disk space. This is particularly impacting if we use aws lambda functions. In place of that if we just use exception then we can skip using the extra library!

I am validating for the file extension to be '.csv'
This will not throw an exception if the bucket does not exist!
This will not throw an exception if the bucket exists but object does not exist!
This throws out an exception if the bucket is empty!
This throws out an exception if the bucket has no permissions!

The code looks like this. Please share your thoughts:

import boto3
import traceback

def download4mS3(s3bucket, s3Path, localPath):
    s3 = boto3.resource('s3')

    print('Looking for the csv data file ending with .csv in bucket: ' + s3bucket + ' path: ' + s3Path)
    if s3Path.endswith('.csv') and s3Path != '':
        try:
            s3.Bucket(s3bucket).download_file(s3Path, localPath)
        except Exception as e:
            print(e)
            print(traceback.format_exc())
            if e.response['Error']['Code'] == "404":
                print("Downloading the file from: [", s3Path, "] failed")
                exit(12)
            else:
                raise
        print("Downloading the file from: [", s3Path, "] succeeded")
    else:
        print("csv file not found in in : [", s3Path, "]")
        exit(12)

0 讨论(0)

上一页 1 2 3 4