I'm trying to read a CSV file from a private S3 bucket into a pandas DataFrame:
df = pandas.read_csv('s3://mybucket/file.csv')
I can read
Updated for Pandas 0.20.1
pandas now uses s3fs to handle S3 connections. From the release notes:
pandas now uses s3fs for handling S3 connections. This shouldn’t break any code. However, since s3fs is not a required dependency, you will need to install it separately, like boto in prior versions of pandas.
import os
import pandas as pd
from s3fs.core import S3FileSystem
# aws keys stored in ini file in same path
# refer to boto3 docs for config settings
os.environ['AWS_CONFIG_FILE'] = 'aws_config.ini'
s3 = S3FileSystem(anon=False)
key = 'path/to/your-csv.csv'  # S3 keys use forward slashes; '\t' would be read as a tab
bucket = 'your-bucket-name'
df = pd.read_csv(s3.open('{}/{}'.format(bucket, key), mode='rb'))
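
With s3fs installed, pandas should also accept the s3:// URL directly, so opening the file by hand may not be needed. Here is a minimal sketch, assuming your credentials are discoverable through the usual AWS mechanisms (environment variables or a config file). The helper names s3_url and read_s3_csv are hypothetical and only for illustration; they are not part of pandas or s3fs.

```python
def s3_url(bucket, key):
    """Build an s3:// URL from a bucket name and an object key.

    Hypothetical helper. The key is used as given, so write it with
    forward slashes (S3 has no notion of Windows-style paths).
    """
    return 's3://{}/{}'.format(bucket, key)


def read_s3_csv(bucket, key, **kwargs):
    """Read a CSV object from S3 into a DataFrame.

    Assumes s3fs is installed and AWS credentials are configured.
    Extra keyword arguments are passed through to pandas.read_csv.
    """
    # Imported lazily so s3_url can be used without pandas installed.
    import pandas as pd
    return pd.read_csv(s3_url(bucket, key), **kwargs)
```

Usage would look like df = read_s3_csv('your-bucket-name', 'path/to/your-csv.csv'). If that fails with a permissions error on your pandas version, the explicit S3FileSystem(anon=False) approach above is the fallback.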