I have a spark pair RDD (key, count) as below
Array[(String, Int)] = Array((a,1), (b,2), (c,1), (d,3))
How to find the key with highest co
Use takeOrdered(1)(Ordering[Int].reverse.on(_._2))
:
val a = Array(("a",1), ("b",2), ("c",1), ("d",3))
val rdd = sc.parallelize(a)
val maxKey = rdd.takeOrdered(1)(Ordering[Int].reverse.on(_._2))
// maxKey: Array[(String, Int)] = Array((d,3))
Quoting the note from RDD.takeOrdered:
This method should only be used if the resulting array is expected to be small, as all the data is loaded into the driver's memory.