Can I write a plain text HDFS (or local) file from a Spark program, not from an RDD?

别跟我提以往 · 2020-12-29 13:08

I have a Spark program (in Scala) and a SparkContext. I am writing some files with RDD's saveAsTextFile. On my local machine I can us…

4 Answers
  •  醉酒成梦
    2020-12-29 13:47

    Thanks to marios and kostya, but there are only a few steps to writing a text file into HDFS from Spark.

    import java.io.BufferedOutputStream
    import org.apache.hadoop.fs.{FileSystem, Path}

    // The Hadoop Configuration is accessible from the SparkContext
    val fs = FileSystem.get(sparkContext.hadoopConfiguration)

    // An output stream for the file can be created from the FileSystem
    val output = fs.create(new Path(filename))

    // Wrap it in a BufferedOutputStream to write an actual text file
    val os = new BufferedOutputStream(output)

    os.write("Hello World".getBytes("UTF-8"))

    os.close()
    

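    Since the question also asks about local files: a minimal sketch of the same approach targeting the local filesystem, using FileSystem.getLocal instead of the default filesystem (the path /tmp/hello.txt is a placeholder; in a real Spark job you would pass sparkContext.hadoopConfiguration rather than a fresh Configuration):

    ```scala
    import java.io.{BufferedOutputStream, OutputStreamWriter, PrintWriter}
    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.{FileSystem, Path}

    // getLocal targets the local filesystem regardless of fs.defaultFS
    val fs = FileSystem.getLocal(new Configuration())

    // Layer a writer on top of the buffered stream for convenient text output
    val out = new PrintWriter(new OutputStreamWriter(
      new BufferedOutputStream(fs.create(new Path("/tmp/hello.txt"))), "UTF-8"))
    out.println("Hello World")
    out.close()
    ```

    The same code writes to HDFS if you swap getLocal for FileSystem.get with a configuration whose fs.defaultFS points at the cluster.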
    Note that FSDataOutputStream, which has been suggested, extends Java's DataOutputStream, not a text writer. Its writeUTF method appears to write plain text, but it actually emits Java's "modified UTF-8" encoding, prefixed with a two-byte length, so the file contains extra bytes that are not part of the text.
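    The writeUTF pitfall is easy to demonstrate without Hadoop at all, using java.io.DataOutputStream (which FSDataOutputStream extends):

    ```scala
    import java.io.{ByteArrayOutputStream, DataOutputStream}

    val buf = new ByteArrayOutputStream()
    val dos = new DataOutputStream(buf)
    dos.writeUTF("Hello World")
    dos.close()

    val bytes = buf.toByteArray
    // writeUTF prefixes the string with a big-endian two-byte length,
    // so "Hello World" (11 bytes of ASCII) becomes 13 bytes on disk.
    println(bytes.length)                                    // 13
    println(bytes.take(2).map(b => f"0x$b%02X").mkString(" "))  // 0x00 0x0B
    ```

    Those two leading bytes are why a file written with writeUTF looks almost, but not quite, like plain text.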
