I have a CSV file in a folder that keeps getting updated continuously. I need to read input from this CSV file and produce some transactions. How can I take data from the c
First, I'm not sure how you arrived at this setup, because a CSV file should be written sequentially, which gives better I/O performance. My recommendation is to make the file append-only and consume the new data as a stream, much like reading from a binlog.
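To illustrate the append-only idea, here is a minimal sketch of polling a file for newly appended rows by remembering a byte offset. The helper name `readNewLines` is made up for this example, and it assumes UTF-8 text, `\n` line endings, and that each new chunk fits in memory.

```scala
import java.io.RandomAccessFile
import java.nio.charset.StandardCharsets

// Hypothetical helper (not part of any library): returns the complete lines
// appended since `offset`, plus the offset to resume from on the next poll.
def readNewLines(path: String, offset: Long): (Seq[String], Long) = {
  val file = new RandomAccessFile(path, "r")
  try {
    val length = file.length()
    if (length <= offset) (Seq.empty, offset) // nothing new yet
    else {
      file.seek(offset)
      // Assumes the new chunk is small enough to hold in memory.
      val bytes = new Array[Byte]((length - offset).toInt)
      file.readFully(bytes)
      // Consume only up to the last newline, so a half-written row
      // is left in place for the next poll.
      val lastNewline = bytes.lastIndexOf('\n'.toByte)
      if (lastNewline < 0) (Seq.empty, offset)
      else {
        val text = new String(bytes, 0, lastNewline, StandardCharsets.UTF_8)
        (text.split("\n", -1).toSeq, offset + lastNewline + 1)
      }
    }
  } finally file.close()
}
```

A caller would keep the returned offset and call the helper again on a timer, passing each batch of lines to whatever produces the transactions.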
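However, if you have to keep the current setup, I think StreamingContext may help you. Note that fileStream watches a directory for new files and does not pick up data appended to a file it has already read, so each update needs to arrive as a new file in that directory.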
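// Batch interval of 1 ms, as in the original; an interval of seconds is usually more practical.
val ssc = new StreamingContext(new SparkConf(), Durations.milliseconds(1))
// Watch /tmp, accept every path, and (newFilesOnly = false) also consider files that already exist there.
val fileStream = ssc.fileStream[LongWritable, Text, TextInputFormat]("/tmp", (x: Path) => true, newFilesOnly = false).map(_._2.toString)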
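For completeness, here is a sketch of the same snippet as a standalone program, with the imports it needs and the `start()` / `awaitTermination()` calls that actually run the stream. The object name, app name, `local[2]` master, 5-second batch interval, and the naive comma split are all assumptions for illustration, not something the original setup prescribes.

```scala
import org.apache.hadoop.fs.Path
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object CsvDirectoryStream {
  def main(args: Array[String]): Unit = {
    // App name and local master are placeholders for a local test run.
    val conf = new SparkConf().setAppName("csv-directory-stream").setMaster("local[2]")
    // Assumed batch interval of 5 seconds; tune to how often files arrive.
    val ssc = new StreamingContext(conf, Seconds(5))

    // Each new file that appears in /tmp is read line by line.
    val lines = ssc
      .fileStream[LongWritable, Text, TextInputFormat]("/tmp", (_: Path) => true, newFilesOnly = false)
      .map(_._2.toString)

    // Naive CSV parsing: splits on commas and does not handle quoted fields.
    val rows = lines.map(_.split(",", -1).toSeq)
    rows.print() // prints the first few rows of every batch

    ssc.start()            // nothing is processed until the context is started
    ssc.awaitTermination() // block until the job is stopped
  }
}
```

With this approach the writer should create each file elsewhere and then move or rename it into the watched directory, so Spark never sees a half-written file.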