How to read a CSV file from an S3 bucket using pandas in Python

粉色の甜心 2020-12-08 17:16

I am trying to read a CSV file located in an AWS S3 bucket into memory as a pandas dataframe using the following code:

import pandas as pd
import boto

data          


        
5 Answers
  •  伪装坚强ぢ
    2020-12-08 17:57

    Using pandas 0.20.3, with boto3 to fetch the object:

    import os
    import boto3
    import pandas as pd
    import sys
    
    if sys.version_info[0] < 3: 
        from StringIO import StringIO # Python 2.x
    else:
        from io import StringIO # Python 3.x
    
    # get your credentials from environment variables
    aws_id = os.environ['AWS_ID']
    aws_secret = os.environ['AWS_SECRET']
    
    client = boto3.client('s3', aws_access_key_id=aws_id,
            aws_secret_access_key=aws_secret)
    
    bucket_name = 'my_bucket'
    
    object_key = 'my_file.csv'
    csv_obj = client.get_object(Bucket=bucket_name, Key=object_key)
    body = csv_obj['Body']
    csv_string = body.read().decode('utf-8')
    
    df = pd.read_csv(StringIO(csv_string))
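
The same fetch-then-parse steps can be wrapped in a small helper. This is only a sketch, not part of the original answer: `read_csv_from_s3` and `FakeS3Client` are hypothetical names introduced for illustration, and the fake client exists purely so the example runs without AWS credentials. It assumes `pd.read_csv` is handed the raw bytes through `BytesIO`, which avoids the explicit `decode('utf-8')` step and lets pandas handle the encoding.

```python
from io import BytesIO

import pandas as pd


def read_csv_from_s3(client, bucket_name, object_key, **read_csv_kwargs):
    """Fetch one S3 object and parse it as CSV into a DataFrame.

    `client` is any object exposing get_object(Bucket=..., Key=...),
    such as the boto3 S3 client created in the answer above.
    """
    response = client.get_object(Bucket=bucket_name, Key=object_key)
    raw_bytes = response["Body"].read()  # whole object is loaded into memory
    return pd.read_csv(BytesIO(raw_bytes), **read_csv_kwargs)


class FakeS3Client:
    """Hypothetical offline stand-in for a boto3 S3 client (illustration only)."""

    def __init__(self, objects):
        # Maps (bucket, key) pairs to the raw bytes of each object.
        self._objects = objects

    def get_object(self, Bucket, Key):
        # Mimics only the part of the real response that the helper uses.
        return {"Body": BytesIO(self._objects[(Bucket, Key)])}


fake_client = FakeS3Client({("my_bucket", "my_file.csv"): b"a,b\n1,2\n3,4\n"})
df = read_csv_from_s3(fake_client, "my_bucket", "my_file.csv")
print(df)
```

With a real client, the call would be `read_csv_from_s3(client, bucket_name, object_key)`. Extra keyword arguments such as `sep` or `encoding` are passed straight through to `pd.read_csv`.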
    
