How to avoid empty files while writing parquet files?

囚心锁ツ 2021-01-16 07:15

I am reading from a Kafka queue using Spark Structured Streaming. After reading from Kafka I apply a filter to the dataframe, and then I save the filtered dataframe as Parquet files. Some of the output files end up empty. How can I avoid writing these empty Parquet files?
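A minimal sketch of the kind of pipeline described above (not the asker's actual code; the broker address, topic name, filter condition, and paths are placeholders). Because each partition of a micro-batch is written as its own file, partitions left with no rows after the filter typically come out as empty Parquet files:

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("kafka-to-parquet").getOrCreate()

# Read the Kafka topic as a streaming dataframe (placeholder broker/topic).
raw = (spark.readStream
       .format("kafka")
       .option("kafka.bootstrap.servers", "broker:9092")
       .option("subscribe", "events")
       .load())

# Apply a filter; partitions whose rows are all filtered out are written
# as empty Parquet files by the sink below.
filtered = (raw.selectExpr("CAST(value AS STRING) AS value")
               .filter(col("value").isNotNull()))

# Write each micro-batch as Parquet (placeholder paths).
query = (filtered.writeStream
         .format("parquet")
         .option("path", "hdfs:///data/filtered")
         .option("checkpointLocation", "hdfs:///chk/filtered")
         .start())

query.awaitTermination()
```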

4 answers
  •  醉话见心
    2021-01-16 07:46

You can try repartitionByRange(column).

    I used this while writing a dataframe to HDFS, and it solved my empty-file issue. A sketch of how it can be applied is shown below.
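One way to apply this suggestion in a Structured Streaming job is inside foreachBatch, where each micro-batch is a static dataframe. This is a sketch, not the answerer's exact code: it reuses the hypothetical `filtered` streaming dataframe from the sketch under the question, and the column name, partition count, and paths are assumptions.

```python
def write_batch(batch_df, batch_id):
    # Skip micro-batches that carry no rows at all.
    if batch_df.rdd.isEmpty():
        return
    # Range-partition by a column so rows are spread across a small, even
    # set of partitions instead of leaving some partitions (and files) empty.
    (batch_df
     .repartitionByRange(4, "value")
     .write
     .mode("append")
     .parquet("hdfs:///data/filtered"))

query = (filtered.writeStream
         .foreachBatch(write_batch)
         .option("checkpointLocation", "hdfs:///chk/filtered")
         .start())
```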
