How to get the specified output without combineByKey and aggregateByKey in a Spark RDD
Question

Below is my data:

val keysWithValuesList = Array("foo=A", "foo=A", "foo=A", "foo=A", "foo=B", "bar=C", "bar=D", "bar=D")

Now I want the following two outputs, but without using combineByKey or aggregateByKey:

1) Array[(String, Int)] = Array((foo,5), (bar,3))
2) Array((foo,Set(B, A)), (bar,Set(C, D)))

Below is my attempt:

scala> val keysWithValuesList = Array("foo=A", "foo=A", "foo=A", "foo=A", "foo=B", "bar=C",
     | "bar=D", "bar=D")
scala> val sample=keysWithValuesList.map(_.split("=")).map(p=>(p(0)
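
One possible way to produce both outputs is sketched below. It is not taken from the original post: it assumes the data is first turned into an RDD of (key, value) pairs with sc.parallelize, and it uses mapValues followed by reduceByKey in place of combineByKey and aggregateByKey. The object name, the app name, and the local[*] master are illustrative assumptions; in spark-shell the existing sc could be used instead of creating a new SparkContext.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical wrapper object, used only to make the sketch self-contained.
object WithoutCombineByKey {
  def main(args: Array[String]): Unit = {
    // Assumption: a local master, for trying the sketch outside a cluster.
    val conf = new SparkConf().setAppName("without-combineByKey").setMaster("local[*]")
    val sc = new SparkContext(conf)

    val keysWithValuesList =
      Array("foo=A", "foo=A", "foo=A", "foo=A", "foo=B", "bar=C", "bar=D", "bar=D")

    // Split each "key=value" string into a (key, value) pair.
    val pairs = sc.parallelize(keysWithValuesList)
      .map(_.split("="))
      .map(p => (p(0), p(1)))

    // 1) Count per key: map every value to 1, then sum per key.
    val counts: Array[(String, Int)] =
      pairs.mapValues(_ => 1).reduceByKey(_ + _).collect()

    // 2) Distinct values per key: wrap each value in a Set, then union per key.
    val sets: Array[(String, Set[String])] =
      pairs.mapValues(v => Set(v)).reduceByKey(_ ++ _).collect()

    // The order of keys in the collected arrays is not guaranteed.
    counts.foreach(println)
    sets.foreach(println)

    sc.stop()
  }
}
```

With the data above, counts should hold (foo,5) and (bar,3), and sets should hold foo with Set(A, B) and bar with Set(C, D), though the order of keys and of set elements may differ from the listing in the question. A groupByKey followed by mapValues(_.size) or mapValues(_.toSet) would be another option, at the cost of shuffling every value before reducing.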