How to read a CSV file from an S3 bucket using pandas in Python

粉色の甜心 2020-12-08 17:16

I am trying to read a CSV file located in an AWS S3 bucket into memory as a pandas dataframe using the following code:

import pandas as pd
import boto

data

5 Answers
  •  暖寄归人
    2020-12-08 17:59

    Based on this answer, which suggested using smart_open to read from S3, this is how I used it with pandas:

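    import os
    import pandas as pd
    # Note: newer smart_open releases deprecate smart_open() in favor of smart_open.open().
    from smart_open import smart_open

    # Read the credentials from environment variables rather than hard-coding them.
    aws_key = os.environ['AWS_ACCESS_KEY']
    aws_secret = os.environ['AWS_SECRET_ACCESS_KEY']

    bucket_name = 'my_bucket'
    object_key = 'my_file.csv'

    # Embed the credentials in the S3 URL: s3://key:secret@bucket/object
    path = 's3://{}:{}@{}/{}'.format(aws_key, aws_secret, bucket_name, object_key)

    # smart_open returns a file-like object that pandas can read from directly.
    df = pd.read_csv(smart_open(path))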
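
For comparison, here is a minimal sketch of the same read using boto3 directly, which keeps the credentials out of the URL. This swaps in a different technique from the smart_open approach above. The helper name `read_csv_from_s3` and its `client` parameter are hypothetical, introduced only for illustration. The sketch assumes boto3 is installed and that credentials are available through the environment, a shared config file, or an IAM role.

```python
import io

import pandas as pd


def read_csv_from_s3(bucket_name, object_key, client=None, **read_csv_kwargs):
    """Download an S3 object and parse it as CSV into a DataFrame.

    Hypothetical helper for illustration; not part of pandas or boto3.
    """
    if client is None:
        # Imported lazily so the helper can also be called with a stub client.
        import boto3

        # boto3 looks up credentials on its own (environment variables,
        # shared config files, or an attached IAM role).
        client = boto3.client("s3")

    response = client.get_object(Bucket=bucket_name, Key=object_key)
    # response["Body"] is a stream; read the whole object into memory.
    payload = response["Body"].read()
    return pd.read_csv(io.BytesIO(payload), **read_csv_kwargs)
```

Calling `read_csv_from_s3('my_bucket', 'my_file.csv')` would then return the DataFrame. Extra keyword arguments such as `sep` or `usecols` are passed through to `pd.read_csv`. Because the whole object is read into memory first, this suits files that fit comfortably in RAM.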
    
