How can I access S3 files in Python using URLs?

Backend · 6 answers · 1171 views
一个人的身影 · asked 2020-12-14 16:44

I want to write a Python script that will read and write files from S3 using their URLs, e.g. 's3:/mybucket/file'. It would need to run locally and in the cloud without any code changes.

6 Answers
  • 2020-12-14 16:50

    http://s3tools.org/s3cmd works pretty well and supports the s3:// form of the URL structure you want. It does the business on Linux and Windows. If you need a native API to call from within a Python program, then http://code.google.com/p/boto/ is a better choice.

  • 2020-12-14 16:56

    Here's how they do it in awscli:

    def find_bucket_key(s3_path):
        """
        This is a helper function that given an s3 path such that the path is of
        the form: bucket/key
        It will return the bucket and the key represented by the s3 path
        """
        s3_components = s3_path.split('/')
        bucket = s3_components[0]
        s3_key = ""
        if len(s3_components) > 1:
            s3_key = '/'.join(s3_components[1:])
        return bucket, s3_key
    
    
    def split_s3_bucket_key(s3_path):
        """Split s3 path into bucket and key prefix.
        This will also handle the s3:// prefix.
        :return: Tuple of ('bucketname', 'keyname')
        """
        if s3_path.startswith('s3://'):
            s3_path = s3_path[5:]
        return find_bucket_key(s3_path)
    

    You could use it with code like this:

    from awscli.customizations.s3.utils import split_s3_bucket_key
    import boto3
    client = boto3.client('s3')
    bucket_name, key_name = split_s3_bucket_key(
        's3://example-bucket-name/path/to/example.txt')
    response = client.get_object(Bucket=bucket_name, Key=key_name)
    

    This doesn't address the goal of interacting with an s3 key as a file like object but it's a step in that direction.
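    One way to close that gap (a sketch of mine, not part of awscli or boto3) is to download the object and wrap its bytes in `io.BytesIO`, which yields a seekable file-like object:

```python
import io


def open_s3_object(client, bucket, key):
    """Download s3://bucket/key and return a seekable file-like object.

    Note: this reads the whole object into memory, which is fine for
    small files but not for very large ones.
    """
    body = client.get_object(Bucket=bucket, Key=key)["Body"]
    return io.BytesIO(body.read())
```

    Here `client` would normally be `boto3.client('s3')`; any stub with the same `get_object` signature works for local testing.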

  • 2020-12-14 17:01

    I haven't seen anything that works directly with S3 URLs, but you could use an S3 access library (simples3 looks decent) and some simple string manipulation:

    >>> url = "s3:/bucket/path/"
    >>> _, path = url.split(":", 1)
    >>> path = path.lstrip("/")
    >>> bucket, path = path.split("/", 1)
    >>> bucket
    'bucket'
    >>> path
    'path/'
    
  • 2020-12-14 17:05

    For opening, it should be as simple as:

    from urllib.request import urlopen  # Python 3 (the original urllib.URLopener is Python 2)
    myurl = "https://s3.amazonaws.com/skyl/fake.xyz"
    myfile = urlopen(myurl)
    

    This will work with S3 if the file is public.

    To write a file using boto, it goes a little something like this:

    from boto.s3.connection import S3Connection
    conn = S3Connection(AWS_KEY, AWS_SECRET)
    bucket = conn.get_bucket(BUCKET)
    destination = bucket.new_key(filename)
    destination.set_contents_from_file(myfile)
    destination.make_public()
    

    lemme know if this works for you :)

  • 2020-12-14 17:07

    You can use the Boto Python API to access S3 from Python. It's a good library. After you install Boto, the following sample program will work for you:

    >>> from boto.s3.key import Key
    >>> b = conn.get_bucket('yourbucket')  # conn is an authenticated S3Connection
    >>> k = Key(b)
    >>> k.key = 'yourfile'
    >>> k.set_contents_from_filename('yourfile.txt')
    

    You can find more information here: http://boto.cloudhackers.com/s3_tut.html#storing-data

  • 2020-12-14 17:12

    Try s3fs

    The first example from the docs:

    >>> import s3fs
    >>> fs = s3fs.S3FileSystem(anon=True)
    >>> fs.ls('my-bucket')
    ['my-file.txt']
    >>> with fs.open('my-bucket/my-file.txt', 'rb') as f:
    ...     print(f.read())
    b'Hello, world'
    