I have some Spark code that processes a CSV file and applies some transformations to it. I now want to save this RDD as a CSV file and add a header. Each line of this RDD is already
One way to write it without a union is to supply the header at the time the part files are merged:
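import java.io.{ByteArrayInputStream, DataInputStream, InputStream}
import java.nio.charset.StandardCharsets
import org.apache.hadoop.io.IOUtils
// Assumes `out` is the open output stream of the merged file and `conf` is the Hadoop Configuration.
// The header ends with a newline so that the first record starts on its own line.
val fileHeader = "This is header\n"
val fileHeaderStream: InputStream = new ByteArrayInputStream(fileHeader.getBytes(StandardCharsets.UTF_8))
IOUtils.copyBytes(fileHeaderStream, out, conf, false)
Now loop over your file parts and append each one to the same stream to write the complete file. copyBytes returns nothing, so keep writing to out:
val in: DataInputStream = ...
IOUtils.copyBytes(in, out, conf, false)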
For me, this ensured that the header always comes first, even when you use coalesce/repartition for efficient writing.
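For reference, the whole merge might look roughly like the sketch below. It is only an illustration under some assumptions: the helper name `mergeWithHeader` and its parameters are made up here, the part files are assumed to be named `part-*` (as Spark's text output usually is) and to live on the same file system as the destination, and each part is assumed to end with a newline.

```scala
import java.io.ByteArrayInputStream
import java.nio.charset.StandardCharsets

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.hadoop.io.IOUtils

object MergeWithHeader {
  // Hypothetical helper: writes `header` first, then appends every part file
  // found under `srcDir` to `dstFile`, in file-name order.
  def mergeWithHeader(srcDir: Path, dstFile: Path, header: String, conf: Configuration): Unit = {
    val fs = srcDir.getFileSystem(conf)
    val out = fs.create(dstFile, true) // true = overwrite an existing file
    try {
      // Header goes in first, terminated by a newline.
      val headerStream = new ByteArrayInputStream((header + "\n").getBytes(StandardCharsets.UTF_8))
      IOUtils.copyBytes(headerStream, out, conf, false)

      // Skip marker files such as _SUCCESS; zero-padded part names sort correctly as strings.
      val parts = fs.listStatus(srcDir)
        .filter(s => s.isFile && s.getPath.getName.startsWith("part-"))
        .map(_.getPath)
        .sortBy(_.getName)

      parts.foreach { part =>
        val in = fs.open(part)
        // close = false keeps `out` open for the next part; the input is closed here.
        try IOUtils.copyBytes(in, out, conf, false)
        finally IOUtils.closeStream(in)
      }
    } finally {
      out.close()
    }
  }
}
```

You would call it after the job has written its output, for example `MergeWithHeader.mergeWithHeader(new Path(outputDir), new Path(mergedFile), "col1,col2", sc.hadoopConfiguration)`, where `outputDir` and `mergedFile` are placeholders for your own paths.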