I am trying to read a CSV file stored in an AWS S3 bucket into memory as a pandas DataFrame, using the following code:
import pandas as pd
import boto
data
Based on this answer, which suggested smart_open for reading from S3, this is how I used it with pandas:
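import os
import pandas as pd
from smart_open import smart_open  # newer smart_open releases deprecate this in favor of smart_open.open

# Read the credentials from the environment rather than hard-coding them.
aws_key = os.environ['AWS_ACCESS_KEY']
aws_secret = os.environ['AWS_SECRET_ACCESS_KEY']

bucket_name = 'my_bucket'
object_key = 'my_file.csv'

# smart_open accepts the credentials embedded in the S3 URL.
path = 's3://{}:{}@{}/{}'.format(aws_key, aws_secret, bucket_name, object_key)

# read_csv accepts the file-like object returned by smart_open.
df = pd.read_csv(smart_open(path))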
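If you would rather keep the credentials out of the URL, one alternative is to fetch the object with boto3 (a different package from the boto import in the question) and hand the bytes to pandas. This is only a minimal sketch: it assumes boto3 is installed and can find credentials through its default chain (environment variables, shared config, or an instance role), and `read_csv_from_s3` is a hypothetical helper name, not part of any library.

```python
import io

import pandas as pd


def read_csv_from_s3(s3_client, bucket_name, object_key, **read_csv_kwargs):
    """Download one S3 object and parse it as CSV (hypothetical helper).

    s3_client is expected to behave like boto3.client("s3"), i.e. to
    provide get_object(Bucket=..., Key=...) returning a dict whose "Body"
    entry has a read() method.
    """
    response = s3_client.get_object(Bucket=bucket_name, Key=object_key)
    # Read the whole object into memory, then let pandas parse the bytes.
    body = response["Body"].read()
    return pd.read_csv(io.BytesIO(body), **read_csv_kwargs)


if __name__ == "__main__":
    import boto3  # imported here so the helper itself has no hard dependency

    s3 = boto3.client("s3")
    df = read_csv_from_s3(s3, "my_bucket", "my_file.csv")
    print(df.head())
```

Extra keyword arguments are passed straight through to `pd.read_csv`, so options such as `sep` or `dtype` work as usual. Because the whole object is read into memory first, this suits files that fit comfortably in RAM.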