I\'ve just started to experiment with AWS SageMaker and would like to load data from an S3 bucket into a pandas dataframe in my SageMaker python jupyter notebook for analysi
In the simplest case you don't need boto3
, because you just read resources.
Then it's even simpler:
import pandas as pd
bucket='my-bucket'
data_key = 'train.csv'
data_location = 's3://{}/{}'.format(bucket, data_key)
pd.read_csv(data_location)
But as Prateek stated make sure to configure your SageMaker notebook instance. to have access to s3. This is done at configuration step in Permissions > IAM role