I need to read a different file in every map() call; the files are in HDFS.
val rdd=sc.parallelize(1 to 10000)
val rdd2=rdd.map{x=>
val hdfs = org.apache.
In your case, I recommend the wholeTextFiles method, which returns a pair RDD where the key is the full path of a file and the value is that file's content as a string.
val filesPairRDD = sc.wholeTextFiles("hdfs://ITS-Hadoop10:9000/")
// (file path, number of lines) for each file; any other function could be applied to the contents instead
val filesLineCount = filesPairRDD.map(x => (x._1, x._2.split("\n").length))
filesLineCount.collect()
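As a rough sketch of the "any other function" part, the per-file logic can be pulled out into plain functions that take the (path, content) pair produced by wholeTextFiles. The names FileStats, fileName and lineCount are made up for illustration, and the sample paths under /dir/ are invented; the demo runs on in-memory pairs so it does not need a cluster.

```scala
object FileStats {
  // Last path segment, e.g. "hdfs://host:9000/dir/a.txt" -> "a.txt".
  def fileName(path: String): String =
    path.substring(path.lastIndexOf('/') + 1)

  // Number of lines: count the newlines, plus one more line
  // if the content does not end with a newline.
  def lineCount(content: String): Int =
    content.count(_ == '\n') +
      (if (content.nonEmpty && !content.endsWith("\n")) 1 else 0)

  def main(args: Array[String]): Unit = {
    // Invented sample data with the same shape as wholeTextFiles output.
    val sample = Seq(
      ("hdfs://ITS-Hadoop10:9000/dir/a.txt", "one\ntwo\n"),
      ("hdfs://ITS-Hadoop10:9000/dir/b.txt", "single line")
    )
    sample
      .map { case (path, content) => (fileName(path), lineCount(content)) }
      .foreach(println) // prints (a.txt,2) then (b.txt,1)
  }
}
```

In spark-shell the same functions could presumably be applied with sc.wholeTextFiles(...).map { case (p, c) => (FileStats.fileName(p), FileStats.lineCount(c)) }.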
Edit
If your files are in subdirectories of the same parent directory (as mentioned in the comments), you could use a wildcard (glob pattern) in the path:
val filesPairRDD = sc.wholeTextFiles("hdfs://ITS-Hadoop10:9000/*/")
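To tie the file contents back to the elements of your original RDD, one option is to key both RDDs by file name and join them, instead of opening a file inside map(). The sketch below is only an illustration under assumptions: the rule "element x needs the file named x.txt" (fileNameFor) is hypothetical, since the question does not say how an element maps to a file, and the demo uses a local temporary directory and a local master rather than your HDFS cluster.

```scala
import java.nio.file.Files
import org.apache.spark.{SparkConf, SparkContext}

object JoinKeysWithFiles {
  // Hypothetical rule: element x needs the file named "<x>.txt".
  def fileNameFor(x: Int): String = s"$x.txt"

  // Returns element -> content of the file that element needs.
  def run(sc: SparkContext, dir: String): Map[Int, String] = {
    // Key each element by the name of the file it needs.
    val keyed = sc.parallelize(1 to 3).map(x => (fileNameFor(x), x))
    // Key each file's content by its file name (last path segment).
    val contents = sc.wholeTextFiles(dir).map { case (path, text) =>
      (path.substring(path.lastIndexOf('/') + 1), text)
    }
    // The join pairs every element with the content of its file.
    keyed.join(contents)
      .map { case (_, (x, text)) => (x, text) }
      .collect()
      .toMap
  }

  def main(args: Array[String]): Unit = {
    // Demo data: three small files in a temporary local directory.
    val dir = Files.createTempDirectory("wholeTextFiles-demo")
    (1 to 3).foreach { i =>
      Files.write(dir.resolve(s"$i.txt"), s"content $i".getBytes("UTF-8"))
    }
    val conf = new SparkConf().setMaster("local[2]").setAppName("join-demo")
    val sc = new SparkContext(conf)
    try println(run(sc, dir.toUri.toString))
    finally sc.stop()
  }
}
```

Keep in mind that wholeTextFiles loads each file as a single string, so this approach suits many small files better than a few very large ones.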
Hope this is clear and helpful