Write single CSV file using spark-csv

心在旅途 2020-11-22 08:43

I am using https://github.com/databricks/spark-csv and trying to write a single CSV file, but I can't get it to work: it creates a folder instead.

Need a Scala function which will take

13 answers

夕颜 (original poster)

2020-11-22 09:09
Spark's df.write API creates multiple part files inside the given path. To force Spark to write only a single part file, use df.coalesce(1).write.csv(...) instead of df.repartition(1).write.csv(...): coalesce is a narrow transformation, whereas repartition is a wide transformation that triggers a full shuffle. See "Spark - repartition() vs coalesce()".
```
df.coalesce(1).write.csv(filepath, header=True)
```
This creates a folder at the given filepath containing a single part-0001-...-c000.csv file. Use
```
cat filepath/part-0001-...-c000.csv > filename_you_want.csv 
```
to give the file a user-friendly name.
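If you would rather do the rename step from Python than from the shell, here is a minimal sketch. It assumes the output folder is on the local filesystem (not HDFS or S3) and that coalesce(1) left exactly one part-*.csv file in it. The helper name promote_single_part_file is made up for illustration and is not part of Spark or spark-csv.

```python
import glob
import os
import shutil


def promote_single_part_file(output_dir: str, target_path: str) -> str:
    """Move the only part-*.csv file in a Spark output folder to target_path.

    Hypothetical helper: assumes a local filesystem and a folder written
    with a single partition, e.g. via df.coalesce(1).write.csv(output_dir).
    """
    part_files = sorted(glob.glob(os.path.join(output_dir, "part-*.csv")))
    if len(part_files) != 1:
        # Refuse to guess when the folder holds zero or several part files.
        raise ValueError(
            f"expected exactly one part file in {output_dir}, "
            f"found {len(part_files)}"
        )
    # Move (rather than copy) so the data is not duplicated on disk.
    shutil.move(part_files[0], target_path)
    return target_path
```

After the write above, calling promote_single_part_file(filepath, "filename_you_want.csv") would leave the CSV under the friendly name. The folder itself (with its _SUCCESS marker) stays behind and can be deleted separately if you don't need it.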