reduceByKey method not being found in Scala Spark


Question


I'm attempting to run the standalone Scala app from http://spark.apache.org/docs/latest/quick-start.html#a-standalone-app-in-scala from source.

This line:

val wordCounts = textFile.flatMap(line => line.split(" ")).map(word => (word, 1)).reduceByKey((a, b) => a + b)

throws this error:

value reduceByKey is not a member of org.apache.spark.rdd.RDD[(String, Int)]
  val wordCounts = logData.flatMap(line => line.split(" ")).map(word => (word, 1)).reduceByKey((a, b) => a + b)

logData.flatMap(line => line.split(" ")).map(word => (word, 1)) returns a MappedRDD, but I cannot find this type in http://spark.apache.org/docs/0.9.1/api/core/index.html#org.apache.spark.rdd.RDD

I'm running this code from the Spark source, so could it be a classpath problem? The required dependencies are on my classpath, though.


Answer 1:


You should import the implicit conversions from SparkContext:

import org.apache.spark.SparkContext._

Spark uses the 'pimp my library' pattern to add methods to RDDs of specific element types. If curious, see SparkContext:1296
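For context, here is a minimal sketch of a standalone app with the fix applied. It assumes a pre-1.3 Spark, where these implicits lived in the SparkContext companion object, and a README.md in the working directory (both are assumptions, not from the question):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.SparkContext._ // brings the pair-RDD implicits into scope

object WordCount {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("WordCount"))
    val textFile = sc.textFile("README.md") // hypothetical input path
    val wordCounts = textFile
      .flatMap(line => line.split(" "))
      .map(word => (word, 1))
      .reduceByKey((a, b) => a + b) // now resolves via PairRDDFunctions
    wordCounts.collect().foreach(println)
    sc.stop()
  }
}

Note that from Spark 1.3 onward these conversions moved into the RDD companion object, so newer versions should compile without the extra import.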




Answer 2:


If you use Maven in Scala IDE: I solved this problem by updating the spark-streaming dependency from version 1.2 to version 1.3.




Answer 3:


Actually, you can find it in the PairRDDFunctions class. PairRDDFunctions contains the extra functions available on RDDs of (key, value) pairs, made reachable through an implicit conversion.

https://spark.apache.org/docs/2.1.0/api/scala/index.html#org.apache.spark.rdd.PairRDDFunctions
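To illustrate the mechanism, here is a simplified sketch of the same pattern. The names PairSyntax, MyPairOps, and myReduceByKey are hypothetical, not Spark's own, and it assumes Spark 1.3+ so that groupByKey resolves without an extra import:

import org.apache.spark.rdd.RDD
import scala.reflect.ClassTag

object PairSyntax {
  implicit class MyPairOps[K: ClassTag, V: ClassTag](self: RDD[(K, V)]) {
    // A naive reduceByKey lookalike; Spark's real version also
    // combines values map-side before shuffling.
    def myReduceByKey(f: (V, V) => V): RDD[(K, V)] =
      self.groupByKey().mapValues(_.reduce(f))
  }
}

After import PairSyntax._, calling rdd.myReduceByKey(_ + _) compiles even though myReduceByKey is not declared on RDD. This is exactly how reduceByKey becomes available on an RDD[(String, Int)] once PairRDDFunctions is in implicit scope.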



Source: https://stackoverflow.com/questions/23943852/reducebykey-method-not-being-found-in-scala-spark
