Can I write a plain text HDFS (or local) file from a Spark program, not from an RDD?

Backend · Unresolved · 4 answers · 993 views

别跟我提以往 · 2020-12-29 13:08

I have a Spark program (in Scala) and a SparkContext. I am writing some files with RDD's saveAsTextFile. On my local machine I can us…

4 Answers
  •  一向 (OP)
     2020-12-29 13:44

    Here's what worked best for me (using Spark 2.0):

    import java.io.BufferedOutputStream
    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.Path

    val path = new Path("hdfs://namenode:8020/some/folder/myfile.txt")
    val conf = new Configuration(spark.sparkContext.hadoopConfiguration)
    conf.setInt("dfs.blocksize", 16 * 1024 * 1024) // 16 MB HDFS block size
    val fs = path.getFileSystem(conf)
    // Overwrite the file if it already exists
    if (fs.exists(path))
        fs.delete(path, true)
    val out = new BufferedOutputStream(fs.create(path))
    val txt = "Some text to output"
    out.write(txt.getBytes("UTF-8"))
    out.flush()
    out.close()
    // Note: FileSystem instances are cached per-configuration; closing one
    // closes it for any other code sharing that cached instance.
    fs.close()
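For the "or local" half of the question: when the target is the driver's local filesystem rather than HDFS, no Hadoop API is needed at all, since the driver is an ordinary JVM process. A minimal sketch using plain `java.nio` (the temp-file path here is just for illustration; you would use your own destination path):

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.Files

// Create a local file and write plain text to it (creates or overwrites).
val localPath = Files.createTempFile("myfile", ".txt")
val txt = "Some text to output"
Files.write(localPath, txt.getBytes(StandardCharsets.UTF_8))

// Read it back to confirm the contents round-trip.
val readBack = new String(Files.readAllBytes(localPath), StandardCharsets.UTF_8)
```

This only writes on the driver; executors have their own local filesystems, so code like this inside an RDD closure would scatter files across worker nodes instead.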
    
