How to update metadata of an existing object in AWS S3 using python boto3?


The boto3 documentation does not clearly specify how to update the user metadata of an already existing S3 object.

3 Answers
  • 2020-12-01 15:04

    It can be done using the copy_from() method:

    import boto3
    
    s3 = boto3.resource('s3')
    s3_object = s3.Object('bucket-name', 'key')
    # Merge the new key into the object's existing user metadata
    s3_object.metadata.update({'id': 'value'})
    # Copy the object onto itself; MetadataDirective='REPLACE' makes S3 store
    # the metadata passed here instead of the original metadata
    s3_object.copy_from(CopySource={'Bucket': 'bucket-name', 'Key': 'key'}, Metadata=s3_object.metadata, MetadataDirective='REPLACE')
    
  • 2020-12-01 15:13

    You can either add a new metadata key or update an existing metadata value with a new one. Here is the code I am using:

    import boto3
    from boto3 import client
    
    # Connection parameters (fill in with your own values)
    param_1 = YOUR_ACCESS_KEY
    param_2 = YOUR_SECRET_KEY
    param_3 = YOUR_END_POINT
    param_4 = YOUR_BUCKET
    
    # Create the S3 client
    s3ressource = client(
        service_name='s3',
        endpoint_url=param_3,
        aws_access_key_id=param_1,
        aws_secret_access_key=param_2,
        use_ssl=True,
        )
    
    # Build the list of image objects in a bucket; any object that is not
    # a .jpg or .png is deleted
    def BuildObjectListPerBucket(variablebucket):
        global listofObjectstobeanalyzed
        listofObjectstobeanalyzed = []
        extensions = ('.jpg', '.png')
        for key in s3ressource.list_objects(Bucket=variablebucket)["Contents"]:
            onemoreObject = key['Key']
            if onemoreObject.endswith(extensions):
                listofObjectstobeanalyzed.append(onemoreObject)
            else:
                s3ressource.delete_object(Bucket=variablebucket, Key=onemoreObject)
        return listofObjectstobeanalyzed
    
    # For a given existing object, create metadata by re-uploading the file
    def createmetadata(bucketname, objectname):
        s3ressource.upload_file(objectname, bucketname, objectname,
                                ExtraArgs={"Metadata": {"metadata1": "ImageName",
                                                        "metadata2": "ImagePROPERTIES",
                                                        "metadata3": "ImageCREATIONDATE"}})
    
    # For a given existing object, add a new metadata key
    def ADDmetadata(bucketname, objectname):
        k = s3ressource.head_object(Bucket=bucketname, Key=objectname)
        m = k["Metadata"]
        m["new_metadata"] = "ImageNEWMETADATA"
        s3ressource.copy_object(Bucket=bucketname, Key=objectname,
                                CopySource=bucketname + '/' + objectname,
                                Metadata=m, MetadataDirective='REPLACE')
    
    # For a given existing object, update an existing metadata key with a new value
    def CHANGEmetadata(bucketname, objectname):
        k = s3ressource.head_object(Bucket=bucketname, Key=objectname)
        m = k["Metadata"]
        m.update({'watson_visual_rec_dic': 'ImageCREATIONDATEEEEEEEEEEEEEEEEEEEEEEEEEE'})
        s3ressource.copy_object(Bucket=bucketname, Key=objectname,
                                CopySource=bucketname + '/' + objectname,
                                Metadata=m, MetadataDirective='REPLACE')
    
    # Print the user metadata of a given object
    def readmetadata(bucketname, objectname):
        ALLDATAOFOBJECT = s3ressource.get_object(Bucket=bucketname, Key=objectname)
        ALLDATAOFOBJECTMETADATA = ALLDATAOFOBJECT['Metadata']
        print(ALLDATAOFOBJECTMETADATA)
    
    # Create the list of objects on a per-bucket basis
    BuildObjectListPerBucket(param_4)
    
    # Call the functions you want and inspect the results
    for objectitem in listofObjectstobeanalyzed:
        readmetadata(param_4, objectitem)
        ADDmetadata(param_4, objectitem)
        readmetadata(param_4, objectitem)
        CHANGEmetadata(param_4, objectitem)
        readmetadata(param_4, objectitem)
    
  • 2020-12-01 15:21

    You can do this using copy_from() on the resource, as another answer mentions, but you can also use the client's copy_object() and specify the same source and destination. The methods are equivalent and invoke the same code underneath.

    import boto3
    s3 = boto3.client("s3")
    src_key = "my-key"
    src_bucket = "my-bucket"
    s3.copy_object(Key=src_key, Bucket=src_bucket,
                   CopySource={"Bucket": src_bucket, "Key": src_key},
                   Metadata={"my_new_key": "my_new_val"},
                   MetadataDirective="REPLACE")
    

    The 'REPLACE' directive tells S3 to overwrite the source object's metadata entirely with the metadata passed in the request. If you only want to add new key-value pairs, or delete some keys, you first have to read the original metadata, edit it locally, and then issue the copy.

    To replace only a subset of the metadata correctly (a sketch follows the list):

    1. Retrieve the original metadata with head_object(Key=src_key, Bucket=src_bucket). Also take note of the ETag in the response.
    2. Make desired changes to the metadata locally.
    3. Call copy_object as above to upload the new metadata, but pass CopySourceIfMatch=original_etag in the request to ensure the remote object has the metadata you expect before overwriting it. original_etag is the one you got in step 1. In case the metadata (or the data itself) has changed since head_object was called (e.g. by another program running simultaneously), copy_object will fail with an HTTP 412 error.
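
    A minimal sketch of that read-modify-write flow, assuming the same src_bucket / src_key as above; the "extra-key" metadata entry is purely illustrative:

    import boto3
    from botocore.exceptions import ClientError
    
    s3 = boto3.client("s3")
    src_bucket = "my-bucket"  # assumed bucket name, as above
    src_key = "my-key"        # assumed object key, as above
    
    # 1. Read the current metadata and remember the ETag
    head = s3.head_object(Bucket=src_bucket, Key=src_key)
    metadata = head["Metadata"]
    original_etag = head["ETag"]
    
    # 2. Edit the metadata locally (illustrative key/value)
    metadata["extra-key"] = "extra-value"
    
    # 3. Copy the object onto itself; the copy succeeds only if the object
    #    still matches the ETag read in step 1
    try:
        s3.copy_object(Key=src_key, Bucket=src_bucket,
                       CopySource={"Bucket": src_bucket, "Key": src_key},
                       Metadata=metadata,
                       MetadataDirective="REPLACE",
                       CopySourceIfMatch=original_etag)
    except ClientError as e:
        # HTTP 412 means the object changed between head_object and copy_object
        if e.response["ResponseMetadata"]["HTTPStatusCode"] == 412:
            print("Object changed since it was read; re-run the update")
        else:
            raise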

    Reference: boto3 issue 389
