I'm programming with Spark Streaming but am having some trouble with Scala. I'm trying to use the function StreamingContext.fileStream.
The definition of this function i
To use fileStream, you have to supply all three type parameters when calling it, so you need to know your key, value and InputFormat types beforehand. If your types are LongWritable, Text and TextInputFormat, you would call fileStream like so:
val lines = ssc.fileStream[LongWritable, Text, TextInputFormat]("/home/sequenceFile")
If those three types do happen to be your types, you may want to use textFileStream instead: it requires no type parameters and delegates to fileStream using the three types mentioned above. Using it looks like this:
val lines = ssc.textFileStream("/home/sequenceFile")
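To see why the type parameters have to be written out, here is a minimal sketch in plain Scala (no Spark needed). describeStream and the three empty classes are hypothetical stand-ins I made up for illustration, not part of Spark. The point is only that when a method's type parameters do not appear in its argument list, the compiler has nothing in the call to infer them from, so the caller must name them.

```scala
import scala.reflect.ClassTag

object ExplicitTypeParams {
  // Hypothetical stand-ins for a key type, a value type and an input format.
  final class OffsetKey
  final class LineValue
  final class LineFormat

  // None of K, V or F appears in the parameter list (only a String does),
  // so the call site has to state them explicitly.
  def describeStream[K: ClassTag, V: ClassTag, F: ClassTag](directory: String): String = {
    // Look up the runtime class name carried by the implicit ClassTag.
    def nameOf[T](implicit tag: ClassTag[T]): String = tag.runtimeClass.getName
    s"$directory -> key=${nameOf[K]}, value=${nameOf[V]}, format=${nameOf[F]}"
  }

  def main(args: Array[String]): Unit = {
    // All three type arguments are given in square brackets before the value arguments.
    println(describeStream[OffsetKey, LineValue, LineFormat]("/home/sequenceFile"))
  }
}
```

The call in main has the same shape as ssc.fileStream[LongWritable, Text, TextInputFormat]("/home/sequenceFile"): types in square brackets first, then the directory.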
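import org.apache.hadoop.fs.Path
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat

// Accept only files whose name ends in "_<epoch millis>" with a timestamp already in the past.
val filterF: Path => Boolean = path =>
  path.toString.split("/").last.split("_").last.toLong < System.currentTimeMillis

// The third argument is newFilesOnly; false means the stream is not restricted to new files.
val streamed_rdd = ssc.fileStream[LongWritable, Text, TextInputFormat]("/user/hdpprod/temp/spark_streaming_input", filterF, false)
  .map(_._2.toString) // keep only the line text (the value of each key/value pair)
  .map(_.split('\t')) // split each line into tab-separated fields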
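One thing to watch with a filter like this: toLong throws if a file name has no numeric suffix after the last underscore. Below is a sketch of a more defensive version of the same check, written as plain Scala on path strings so it can be tried without Spark or Hadoop. TimestampFilter, timestampSuffix and isDue are names I made up, and rejecting files without a numeric suffix is my assumption about the desired behaviour.

```scala
import scala.util.Try

object TimestampFilter {
  // Parses the text after the last '_' in the final path segment as epoch millis.
  // Returns None when the path has no usable segment or the suffix is not a number.
  def timestampSuffix(path: String): Option[Long] =
    Try(path.split("/").last.split("_").last.toLong).toOption

  // True when the file's timestamp suffix is earlier than nowMillis.
  // Files without a numeric suffix are rejected (an assumption, see above).
  def isDue(path: String, nowMillis: Long): Boolean =
    timestampSuffix(path).exists(_ < nowMillis)
}
```

To plug it into fileStream, the filter argument could be written as (p: Path) => TimestampFilter.isDue(p.toString, System.currentTimeMillis).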