Save a large Spark DataFrame as a single JSON file in S3

星月不相逢 2021-02-01 20:39

I'm trying to save a Spark DataFrame (more than 20 GB) to a single JSON file in Amazon S3. My code to save the DataFrame looks like this:

dataframe.repartition(1)         


        
3 Answers
  •  耶瑟儿~
    2021-02-01 21:13

    Try this:

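    import org.apache.spark.sql.SaveMode

    // Write the DataFrame as JSON, appending to whatever already exists at the target path
    dataframe.write.format("org.apache.spark.sql.json").mode(SaveMode.Append).save("hdfs://localhost:9000/sampletext.txt")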
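
    For context on what a write like this produces: Spark treats the target path as a directory and writes one part file per partition into it, so reducing the DataFrame to a single partition gives one part-*.json file inside that directory rather than a file with exactly the name you passed. Below is a minimal Scala sketch of that idea for S3. The helper name writeAsSinglePart and the path s3a://my-bucket/output/testfile are hypothetical, and the sketch assumes the S3A connector (hadoop-aws) and credentials are already configured on the cluster.

    ```scala
    import org.apache.spark.sql.{DataFrame, SaveMode}

    // Hypothetical helper: writes `dataframe` as JSON into `outputDir`,
    // producing a single part-*.json file inside that directory.
    def writeAsSinglePart(dataframe: DataFrame, outputDir: String): Unit = {
      dataframe
        .repartition(1)            // shuffle everything into one partition => one output part file
        .write
        .mode(SaveMode.Overwrite)  // replace any previous output; Append would add further part files
        .json(outputDir)           // JSON Lines output: one JSON object per line
    }

    // Assumed bucket and prefix, for illustration only.
    writeAsSinglePart(dataframe, "s3a://my-bucket/output/testfile")
    ```

    With more than 20 GB of data, the single partition means one task writes the whole output, so expect this step to be slow. repartition(1) adds a shuffle but keeps the upstream stages parallel, whereas coalesce(1) avoids the shuffle and can collapse the upstream computation into a single task as well.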
    
