Reading a file from a private S3 bucket to a pandas dataframe

猫巷女王i 2020-12-08 10:19

I'm trying to read a CSV file from a private S3 bucket to a pandas dataframe:

df = pandas.read_csv('s3://mybucket/file.csv')

I can read

8 answers
  •  被撕碎了的回忆
    2020-12-08 10:37

    Pandas uses boto (not boto3) inside read_csv (newer pandas releases rely on s3fs for this instead). Installing boto may be enough to make it work correctly.

    There are some problems with boto on Python 3.4.4 / Python 3.5.1. If you're on those versions, and until those are fixed, you can use boto3 instead:

    import boto3
    import pandas as pd
    
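    # Credentials are resolved by boto3 (environment variables, ~/.aws/credentials, IAM role)
    s3 = boto3.client('s3')
    obj = s3.get_object(Bucket='bucket', Key='key')
    # obj['Body'] is a file-like stream that pandas can read from directly
    df = pd.read_csv(obj['Body'])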
    

    obj['Body'] has a .read method (which returns the object's contents as bytes), which is enough for pandas.
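
If passing the body straight to pandas does not work with your pandas/botocore combination, one workaround is to read the bytes first and wrap them in an in-memory buffer. The sketch below is only an illustration: the helper name read_s3_csv and its client parameter are hypothetical (not part of pandas or boto3), and it assumes the file is small enough to fit in memory.

```python
import io

import pandas as pd


def read_s3_csv(bucket, key, client=None, **read_csv_kwargs):
    """Fetch an S3 object and parse it as CSV (hypothetical helper)."""
    if client is None:
        # Imported lazily so a stub client can be passed in without boto3 installed
        import boto3
        client = boto3.client("s3")
    obj = client.get_object(Bucket=bucket, Key=key)
    # Read the whole object into memory, then give pandas a seekable buffer
    data = obj["Body"].read()
    return pd.read_csv(io.BytesIO(data), **read_csv_kwargs)
```

Usage would look like df = read_s3_csv('bucket', 'key'), with any extra keyword arguments (sep, dtype, and so on) forwarded to pd.read_csv.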
