Add a header before text file on save in Spark

感动是毒 2020-12-18 22:35

I have some Spark code that processes a CSV file and applies some transformations to it. I now want to save the resulting RDD as a CSV file and add a header. Each line of this RDD is already

5 answers
  •  自闭症患者
    2020-12-18 22:53

    Here is a way to write the header without a union: supply the header at the time the part files are merged into one output file.

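    import java.io.{ByteArrayInputStream, InputStream}
    import java.nio.charset.StandardCharsets
    import org.apache.hadoop.io.IOUtils
    val fileHeader = "This is header\n" // trailing newline keeps the header on its own line
    val fileHeaderStream: InputStream = new ByteArrayInputStream(fileHeader.getBytes(StandardCharsets.UTF_8))
    IOUtils.copyBytes(fileHeaderStream, out, conf, false) // `out`: open stream of the merged file; false keeps it open

    Now loop over your part files and append each one to the same output stream to write the complete file:

    val in: DataInputStream = ... // input stream of one part file
    IOUtils.copyBytes(in, out, conf, false) // copyBytes returns nothing, so keep writing to `out`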
    

    For me, this ensured that the header always comes first, even when using coalesce/repartition for efficient writing.
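
    The two steps above can be put together roughly as follows. This is only a sketch under some assumptions: the RDD has already been saved with saveAsTextFile into a directory of part-* files, the merged file is written outside that directory, and the names MergeWithHeader, mergeWithHeader, srcDir and dstFile are hypothetical and not taken from the answer.

    ```scala
    import java.io.{ByteArrayInputStream, InputStream}
    import java.nio.charset.StandardCharsets

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.{FileSystem, Path}
    import org.apache.hadoop.io.IOUtils

    object MergeWithHeader {
      // Hypothetical helper: writes `header` first, then appends every part-* file found in srcDir.
      // Assumes dstFile lives outside srcDir.
      def mergeWithHeader(srcDir: Path, dstFile: Path, header: String, conf: Configuration): Unit = {
        val fs = FileSystem.get(srcDir.toUri, conf)
        val out = fs.create(dstFile, true) // true: overwrite dstFile if it already exists
        try {
          // Step 1: the header goes in before any data; `false` leaves `out` open.
          val headerStream: InputStream =
            new ByteArrayInputStream((header + "\n").getBytes(StandardCharsets.UTF_8))
          IOUtils.copyBytes(headerStream, out, conf, false)

          // Step 2: append the part files in name order, skipping markers such as _SUCCESS.
          val parts = fs.listStatus(srcDir)
            .map(_.getPath)
            .filter(_.getName.startsWith("part-"))
            .sortBy(_.getName)

          for (part <- parts) {
            val in = fs.open(part)
            try IOUtils.copyBytes(in, out, conf, false)
            finally in.close()
          }
        } finally out.close()
      }
    }
    ```

    With a SparkContext named sc, this could be called after rdd.saveAsTextFile(...) as MergeWithHeader.mergeWithHeader(new Path(partsDir), new Path(mergedFile), "col1,col2", sc.hadoopConfiguration), where partsDir, mergedFile and the header text are placeholders. Because the header is written to the stream before any part file, its position does not depend on how the RDD was partitioned.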
