How to read a list of parquet files from S3 as a pandas dataframe using pyarrow?

小蘑菇 2020-12-04 09:15

I have a hacky way of achieving this using boto3 (1.4.4), pyarrow (0.4.1) and pandas (0.20.3).

First, I can read a single parq

7 answers
  •  情话喂你
    2020-12-04 09:34

    If you are open to using another library, AWS Data Wrangler can also do this.

    import awswrangler as wr
    
    df = wr.s3.read_parquet(path="s3://...")
    
