How to export data from Spark SQL to CSV

迷失自我 2020-12-04 15:31

This command works with HiveQL:

insert overwrite directory '/data/home.csv' select * from testtable;

But with Spark SQL I'm getting an error.

7 Answers
  •  日久生厌
    2020-12-04 16:09

    The answer above using spark-csv is correct, but there is an issue: the library creates several output files based on the DataFrame's partitioning, which is usually not what we want. You can combine all partitions into one:

    df.coalesce(1)
      .write
      .format("com.databricks.spark.csv")
      .option("header", "true")
      .save("myfile.csv")
    

    and rename the library's output (a file named "part-00000" inside the directory that save() creates) to your desired filename.
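    The rename step can be sketched in the shell. This is only a simulation: the `mkdir`/`echo` lines stand in for what `save("myfile.csv")` actually produces, and `myfile-final.csv` is an example target name.

    ```shell
    # save("myfile.csv") creates a DIRECTORY named myfile.csv; with
    # coalesce(1) it contains a single part file. Simulate that here:
    mkdir -p myfile.csv
    echo "col1,col2" > myfile.csv/part-00000   # stand-in for Spark's output

    # The actual rename step: move the part file to the desired name,
    # then remove the now-empty output directory.
    mv myfile.csv/part-00000 myfile-final.csv
    rmdir myfile.csv
    ```

    On HDFS the same idea applies with `hdfs dfs -mv` instead of `mv`.
    
    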

    This blog post provides more details: https://fullstackml.com/2015/12/21/how-to-export-data-frame-from-apache-spark/
