How to download Amazon S3 files on to local machine in folder using python and boto3?

半城伤御伤魂 submitted on 2020-01-07 08:25:12

Question


I am trying to download files from Amazon S3 to a predefined folder on my local machine. The code below runs without errors, but each file is saved with the last component of the folder path prepended to its name. How should I correct this?

import boto3
import os

S3_Object = boto3.client('s3', aws_access_key_id='##', aws_secret_access_key='##')
BUCKET_NAME = '##'
filename2 = []
Key2 = []
bucket = S3_Object.list_objects(Bucket=BUCKET_NAME)['Contents']
download_path = target_file_path = os.path.join('..', 'data', 'lz', 'test_sample', 'sample_file' )

for key in bucket:
    path, filename = os.path.split(key['Key'])
    filename2.append(filename)
    Key2.append(key['Key'])

for f in Key2:
    if f.endswith('.csv'):
        #if f.endswith('.csv'):
            print(f)           
            file_name = str(f.rsplit('/', 1)[-1])
            print(file_name)
            if not os.path.exists(download_path):
                os.makedirs(download_path)
            else:
                S3_Object.download_file(BUCKET_NAME, f, download_path + file_name)
                print("success")

Answer 1:


Here is my test code.

import boto3
import os

s3 = boto3.resource('s3')
bucket = 'your bucket'
response = s3.Bucket(bucket).objects.all()
# If you want to search only specific path of bucket,
#response = s3.Bucket(bucket).objects.filter(Prefix='path')

path = 'your path'
if not os.path.exists(path):
    os.makedirs(path)

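for item in response:
    # Keep only the part of the key after the last '/'.
    filename = item.key.rsplit('/', 1)[-1]
    if filename.endswith('.csv'):
        # os.path.join adds the separator between the folder and the file name.
        s3.Object(bucket, item.key).download_file(os.path.join(path, filename))
        print("success")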

I have tested the code and it gives a correct name.


What is wrong?

I think a path separator (/) is missing between the folder and the file name in your code.

print(os.path.join('..', 'data', 'lz', 'test_sample', 'sample_file'))

The code gives the result:

../data/lz/test_sample/sample_file

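So, in the step below,

S3_Object.download_file(BUCKET_NAME, f, download_path + file_name)

download_path + file_name joins the two strings with no separator, so the file is written to ../data/lz/test_sample/ under a name that starts with sample_file. The call should be: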

S3_Object.download_file(BUCKET_NAME, f, download_path + '/' + file_name)
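As a minimal sketch of the same fix, os.path.join can build the local path so that the separator is never forgotten. The helper names local_target and download_csv_files are hypothetical, and client is assumed to be a boto3 S3 client (any object with a compatible download_file method would work). The sketch also creates the folder before the loop instead of in an if/else, because in the question's code the else branch means nothing is downloaded on the pass that creates the folder.

```python
import os


def local_target(download_path, key):
    """Return the local file path for an S3 key inside download_path."""
    # S3 keys always use '/' as the separator, regardless of the local OS.
    file_name = key.rsplit('/', 1)[-1]
    # os.path.join inserts the separator that plain '+' concatenation omits.
    return os.path.join(download_path, file_name)


def download_csv_files(client, bucket_name, keys, download_path):
    """Download every .csv key into download_path; return the local paths."""
    # Create the folder up front so no file is skipped on the first pass.
    os.makedirs(download_path, exist_ok=True)
    saved = []
    for key in keys:
        if key.endswith('.csv'):
            target = local_target(download_path, key)
            # Assumed boto3 client call: download_file(Bucket, Key, Filename).
            client.download_file(bucket_name, key, target)
            saved.append(target)
    return saved
```

With the names from the question, the call would look like download_csv_files(S3_Object, BUCKET_NAME, Key2, download_path).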



Answer 2:


The following function downloads the files recursively.

The directories are created locally only if they contain files.

import boto3
import os

def download_dir(client, resource, dist, local='/tmp', bucket='your_bucket'):
    # Page through the listing; Delimiter='/' groups deeper keys into CommonPrefixes.
    paginator = client.get_paginator('list_objects')
    for result in paginator.paginate(Bucket=bucket, Delimiter='/', Prefix=dist):
        # Recurse into each "subdirectory" (common prefix) under dist.
        if result.get('CommonPrefixes') is not None:
            for subdir in result.get('CommonPrefixes'):
                download_dir(client, resource, subdir.get('Prefix'), local, bucket)
        # Download the objects directly under dist, creating local folders as needed.
        for file in result.get('Contents', []):
            dest_pathname = os.path.join(local, file.get('Key'))
            if not os.path.exists(os.path.dirname(dest_pathname)):
                os.makedirs(os.path.dirname(dest_pathname))
            resource.meta.client.download_file(bucket, file.get('Key'), dest_pathname)

The function is called like this:

def _start():
    client = boto3.client('s3')
    resource = boto3.resource('s3')
    download_dir(client, resource, 'clientconf/', '/tmp', bucket='my-bucket')
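To check which local paths such a recursive download would produce before contacting S3, the key-to-path mapping can be tried on its own. This is only a sketch: plan_downloads and ensure_parent_dirs are hypothetical helpers, not part of boto3 or of the answer above, and the sketch assumes that keys ending in '/' are folder placeholder objects that should be skipped rather than downloaded.

```python
import os


def plan_downloads(keys, local='/tmp'):
    """Map S3 keys to local paths, mirroring the key hierarchy under local."""
    plan = []
    for key in keys:
        # Assumption: a key ending in '/' is a folder placeholder, not a file.
        if key.endswith('/'):
            continue
        # Split on '/' so the path is rebuilt with the local OS separator.
        dest_pathname = os.path.join(local, *key.split('/'))
        plan.append((key, dest_pathname))
    return plan


def ensure_parent_dirs(plan):
    """Create the parent directory of every planned file."""
    for _key, dest_pathname in plan:
        # exist_ok=True makes repeated calls for the same folder harmless.
        os.makedirs(os.path.dirname(dest_pathname), exist_ok=True)
```

Because directories are derived from the planned files, a prefix that contains no files never produces a local folder, which matches the behavior described for download_dir.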


Source: https://stackoverflow.com/questions/57979867/how-to-download-amazon-s3-files-on-to-local-machine-in-folder-using-python-and-b
