How to export data from Spark SQL to CSV

迷失自我 2020-12-04 15:31

This command works with HiveQL:

insert overwrite directory '/data/home.csv' select * from testtable;

But with Spark SQL I'm getting an error.

7 Answers
  •  日久生厌
    2020-12-04 16:09

    The answer above using spark-csv is correct, but there is an issue: the library creates several output files based on the DataFrame's partitioning, which is usually not what we want. You can combine all partitions into one:

    df.coalesce(1)
      .write
      .format("com.databricks.spark.csv")
      .option("header", "true")
      .save("myfile.csv")
    

    and rename the library's output (a file named "part-00000" inside the directory that save() creates) to your desired filename.
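    The rename step can be sketched in the shell. This is only a simulation: the `mkdir`/`echo` lines stand in for what `save("myfile.csv")` actually produces, and `myfile-final.csv` is an example target name.

    ```shell
    # save("myfile.csv") creates a DIRECTORY named myfile.csv; with
    # coalesce(1) it contains a single part file. Simulate that here:
    mkdir -p myfile.csv
    echo "col1,col2" > myfile.csv/part-00000   # stand-in for Spark's output

    # The actual rename step: move the part file to the desired name,
    # then remove the now-empty output directory.
    mv myfile.csv/part-00000 myfile-final.csv
    rmdir myfile.csv
    ```

    On HDFS the same idea applies with `hdfs dfs -mv` instead of `mv`.
    
    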

    This blog post provides more details: https://fullstackml.com/2015/12/21/how-to-export-data-frame-from-apache-spark/
