I have an RDD that is generated using Spark. To write this RDD out as a CSV file, I can use methods like "saveAsTextFile()", which writes the output to HDFS.
I want to write the file to my local file system so that my SSIS process can pick up the files and load them into the DB.
I am currently unable to use Sqoop.
Is there a way to do this in Java, other than writing shell scripts?
If anything needs clarifying, please let me know.
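saveAsTextFile() is able to take in local file system paths (e.g. file:///tmp/magic/...). However, if you're running on a distributed cluster, each executor writes its own partitions to its own local disk, so the output ends up scattered across the worker machines. In that case you most likely want to collect() the data back to the driver (provided it fits in the driver's memory) and then save it with standard file operations.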
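As a rough sketch of the "standard file operations" step, the snippet below writes a list of lines to a local file using only java.nio. It assumes the lines have already been brought to the driver, for example with collect() on a JavaRDD<String>. The class name LocalCsvWriter, the helper writeLines, the sample rows, and the output location under the temp directory are all made up for illustration and are not part of Spark.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class; the name is illustrative only.
final class LocalCsvWriter {

    // Writes the given lines to a file on the local file system of the
    // machine running this code (in a Spark job, that is the driver).
    static void writeLines(List<String> lines, Path target) throws IOException {
        Path parent = target.toAbsolutePath().getParent();
        if (parent != null) {
            // Create missing parent directories so the write does not fail.
            Files.createDirectories(parent);
        }
        // Creates the file, or truncates it if it already exists.
        Files.write(target, lines, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Assumption: in a real job these lines would come from the RDD, e.g.
        //   List<String> lines = rdd.collect();   // rdd is a JavaRDD<String>
        // Sample data is used here so the sketch runs without Spark.
        List<String> lines = Arrays.asList("id,name", "1,alice", "2,bob");

        // Illustrative output location under the system temp directory.
        Path target = Paths.get(System.getProperty("java.io.tmpdir"),
                "rdd-export", "output.csv");

        writeLines(lines, target);
        System.out.println("Wrote " + lines.size() + " lines to " + target);
    }
}
```

Because everything passes through the driver, this only suits results small enough to hold in its memory. For larger outputs, writing to HDFS with saveAsTextFile() and copying the files out afterwards may be the safer route.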
Source: https://stackoverflow.com/questions/31239161/save-a-spark-rdd-to-the-local-file-system-using-java