Reading a file from a private S3 bucket to a pandas dataframe

猫巷女王i 2020-12-08 10:19

I'm trying to read a CSV file from a private S3 bucket into a pandas DataFrame:

df = pandas.read_csv('s3://mybucket/file.csv')

I can read

8 Answers

孤城傲影 (original poster)

2020-12-08 10:26

Updated for Pandas 0.20.1

pandas now uses s3fs to handle S3 connections. From the release notes:

pandas now uses s3fs for handling S3 connections. This shouldn’t break any code. However, since s3fs is not a required dependency, you will need to install it separately, like boto in prior versions of pandas.

import os

import pandas as pd
from s3fs.core import S3FileSystem

# AWS keys are stored in an ini file in the current working directory;
# refer to the boto3 docs for the config file settings
os.environ['AWS_CONFIG_FILE'] = 'aws_config.ini'

s3 = S3FileSystem(anon=False)
key = 'path/to/your-csv.csv'  # S3 keys use forward slashes; '\t' would be read as a tab
bucket = 'your-bucket-name'

# open the object in binary mode and hand the file object to pandas
df = pd.read_csv(s3.open('{}/{}'.format(bucket, key), mode='rb'))
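On newer pandas versions you may not need to open the file yourself. The following is a minimal sketch, assuming pandas 1.2 or later (where `read_csv` accepts `storage_options`) and that s3fs is installed. The helper names `s3_url` and `read_private_csv` are hypothetical, not part of pandas or s3fs, and the credentials are assumed to come from the usual AWS environment or config files.

```python
def s3_url(bucket: str, key: str) -> str:
    """Build an s3:// URL from a bucket name and an object key (hypothetical helper)."""
    # A leading slash on the key would produce an empty path segment, so drop it.
    return "s3://{}/{}".format(bucket, key.lstrip("/"))


def read_private_csv(bucket: str, key: str, **read_csv_kwargs):
    """Read a CSV from a private bucket into a DataFrame (hypothetical helper).

    Assumes pandas >= 1.2 and s3fs are installed and that AWS credentials
    are discoverable through the normal credential chain.
    """
    import pandas as pd  # imported here so s3_url stays usable without pandas

    # storage_options is forwarded to s3fs; anon=False requests signed access.
    return pd.read_csv(
        s3_url(bucket, key),
        storage_options={"anon": False},
        **read_csv_kwargs,
    )
```

With the names from the answer above, the call would look like `df = read_private_csv('your-bucket-name', 'path/to/your-csv.csv')`. Whether this works in your setup depends on your pandas and s3fs versions, so treat it as an alternative to try rather than a guaranteed replacement.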
