Methods for writing Parquet files using Python?

再見小時候 2021-02-02 09:30

I'm having trouble finding a library that allows Parquet files to be written using Python. Bonus points if I can use Snappy or a similar compression mechanism in conjunction wi

6 answers
  •  故里飘歌
    2021-02-02 10:02

    I've written a comprehensive guide to Python and Parquet, with an emphasis on taking advantage of Parquet's three primary optimizations: columnar storage, columnar compression, and data partitioning. A fourth optimization, row groups, isn't covered yet, but it isn't commonly used. The main ways of working with Parquet in Python are pandas, PyArrow, fastparquet, PySpark, Dask, and AWS Data Wrangler.

    Check out the post here: Python and Parquet Performance In Pandas, PyArrow, fastparquet, AWS Data Wrangler, PySpark and Dask
