I have a hacky way of achieving this using boto3 (1.4.4), pyarrow (0.4.1) and pandas (0.20.3).
First, I can read a single parq
It can also be done using boto3 alone, without calling pyarrow directly (pandas still needs a Parquet engine such as pyarrow or fastparquet installed):
import boto3
import io
import pandas as pd
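# Download the parquet file into an in-memory buffer
buffer = io.BytesIO()
s3 = boto3.resource('s3')
obj = s3.Object('bucket_name', 'key')  # named obj to avoid shadowing the built-in `object`
obj.download_fileobj(buffer)
buffer.seek(0)  # rewind before reading
# Parse the buffer into a DataFrame
df = pd.read_parquet(buffer)
print(df.head())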
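
If you need this in more than one place, the snippet above can be wrapped in a small helper. This is only a sketch: the function names `download_to_buffer` and `read_parquet_from_s3` are made up for illustration, and it assumes a pandas version that provides `pd.read_parquet` plus an installed Parquet engine.

```python
import io


def download_to_buffer(s3_object):
    """Download an S3 object into memory and return a BytesIO rewound to the start.

    `s3_object` is anything exposing download_fileobj(fileobj), such as a
    boto3 `s3.Object`. (Hypothetical helper, not part of boto3.)
    """
    buffer = io.BytesIO()
    s3_object.download_fileobj(buffer)
    buffer.seek(0)  # rewind so readers start from the first byte
    return buffer


def read_parquet_from_s3(bucket_name, key):
    """Read one Parquet object from S3 into a DataFrame (hypothetical helper)."""
    # Imported here so download_to_buffer stays usable without these packages.
    import boto3
    import pandas as pd

    s3 = boto3.resource("s3")
    return pd.read_parquet(download_to_buffer(s3.Object(bucket_name, key)))
```

Calling `read_parquet_from_s3('bucket_name', 'key')` should then give the same DataFrame as the inline version. Keeping the download step separate makes it possible to check it with a stand-in object, without touching S3.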