Spark Task not Serializable with simple accumulator?

Submitted by ≯℡__Kan透↙ on 2019-12-12 16:16:04

Question


I am running this simple code:

val accum = sc.accumulator(0, "Progress");
listFilesPar.foreach {
  filepath =>
    accum += 1
}

listFilesPar is an RDD[String]

which throws the following error:

org.apache.spark.SparkException: Task not serializable

I don't understand what is happening. I use braces rather than parentheses because I need to write a lengthy function. I am just doing unit testing.


Answer 1:


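The typical cause is that the closure unexpectedly captures something, usually something you left out of your paste because you would never expect it to be serialized. For example, if the closure refers to a field or method of the enclosing class, it captures the whole enclosing instance, and that instance must then be serializable too.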
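To illustrate the capture problem without needing a Spark cluster, here is a minimal sketch that uses plain Java serialization, which is what Spark's default closure serializer relies on. The names FileCountTest, prefix and ClosureCheck are hypothetical and do not come from the question; the class simply stands in for a test class that is not Serializable.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Hypothetical stand-in for a unit-test class; note that it is NOT Serializable.
class FileCountTest {
  val prefix = "file:" // a field: reading it inside a lambda pulls in `this`

  // This closure reads the field `prefix`, so it captures the whole instance.
  def capturing: String => String = path => prefix + path

  // Copying the field into a local val first means only the String is captured.
  def safe: String => String = {
    val localPrefix = prefix
    path => localPrefix + path
  }
}

object ClosureCheck {
  // Returns true if plain Java serialization accepts the object.
  def serializes(obj: AnyRef): Boolean =
    try {
      new ObjectOutputStream(new ByteArrayOutputStream()).writeObject(obj)
      true
    } catch {
      case _: NotSerializableException => false
    }

  def main(args: Array[String]): Unit = {
    val t = new FileCountTest
    println(s"capturing closure serializes: ${serializes(t.capturing)}")
    println(s"safe closure serializes: ${serializes(t.safe)}")
  }
}
```

I would expect the first closure to be rejected and the second to be accepted. If accum in your code is a field of the test class rather than a local val, the same mechanism would apply to it, but that is a guess, since the surrounding code is not shown.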

You can reduce your code until you find the culprit, or turn on serialization debug logging with the JVM option -Dsun.io.serialization.extendedDebugInfo=true. The output will probably show that Spark is trying to serialize something silly.
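Since this is a JVM option, it has to be set when the JVM starts. As a sketch, assuming the unit tests run under sbt (the question does not say which build tool is used), the build definition could pass it like this:

```scala
// build.sbt fragment (assumes sbt 1.x; the question does not mention the build tool)
// javaOptions only take effect when the tests run in a forked JVM.
Test / fork := true
Test / javaOptions += "-Dsun.io.serialization.extendedDebugInfo=true"
```

With spark-submit, the same flag can be passed to the driver through --driver-java-options, and to executors through the spark.executor.extraJavaOptions setting.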



Source: https://stackoverflow.com/questions/27980781/spark-task-not-serializable-with-simple-accumulator
