How can I write a parquet file using Spark (pyspark)?

一个人的身影 2020-12-29 22:00

I'm pretty new to Spark and I've been trying to convert a DataFrame to a Parquet file in Spark, but I haven't had success yet. The documentation says that I can use

2 answers
  •  悲&欢浪女
    2020-12-29 22:40

    You can also write out Parquet files from Spark with Koalas. This library is great for folks who prefer pandas syntax. Koalas runs on PySpark under the hood.

    Here's the Koalas code:

    import databricks.koalas as ks
    
    df = ks.read_csv('/temp/proto_temp.csv')
    df.to_parquet('output/proto.parquet')
    

    Read this blog post if you'd like more details.
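
For comparison, here is a minimal sketch of the same CSV-to-Parquet conversion in plain PySpark, without Koalas. The helper names `csv_to_parquet` and `main`, the app name, and the `header`/`inferSchema`/`overwrite` choices are assumptions made for illustration; the paths are the ones from the Koalas snippet above.

```python
def csv_to_parquet(spark, csv_path, parquet_path):
    """Read a CSV file into a DataFrame and write it back out as Parquet.

    `spark` is an existing SparkSession. The options below are assumptions:
    the CSV is taken to have a header row, and any existing output is replaced.
    """
    # header=True uses the first row as column names;
    # inferSchema=True asks Spark to guess column types instead of using strings.
    df = spark.read.csv(csv_path, header=True, inferSchema=True)

    # DataFrameWriter: mode("overwrite") replaces the target path if it exists.
    # Spark writes a directory of part files at parquet_path, not a single file.
    df.write.mode("overwrite").parquet(parquet_path)
    return df


def main():
    # Imported here so the helper above can be reused with any session.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("csv-to-parquet").getOrCreate()
    try:
        csv_to_parquet(spark, "/temp/proto_temp.csv", "output/proto.parquet")
    finally:
        spark.stop()
```

Calling `main()` runs the conversion against a local or cluster session, whichever `getOrCreate()` resolves to. The output can be read back with `spark.read.parquet("output/proto.parquet")`.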
