How to read a Parquet file into a Pandas DataFrame?

醉酒成梦 2020-12-07 18:03

How to read a modestly sized Parquet data-set into an in-memory Pandas DataFrame without setting up a cluster computing infrastructure such as Hadoop or Spark? This is only

3 Answers
  •  小蘑菇 (OP)
    2020-12-07 18:44

    Aside from pandas, Apache pyarrow also provides a way to read a Parquet file into a DataFrame.

    The code is short:

    import pyarrow.parquet as pq
    
    df = pq.read_table(source=your_file_path).to_pandas()
    

    For more information, see "Reading and Writing Single Files" in the Apache pyarrow documentation.
