I know the accumulator variables are 'write only' from the point of view of tasks, when they are in execution in worker nodes. I was doing some testing on this and I reali
Why is it printing 0 as the value of the accumulator, when we had initiated it as 123 in the driver?
Because worker nodes never see the initial value. The only thing passed to workers is zero, as defined in AccumulatorParam. For Accumulator[Int] it is simply 0. If you update the accumulator first, you'll see the updated local value:
val acc = sc.accumulator(123)
val rdd = sc.parallelize(List(1, 2, 3))
rdd.foreach(i => {acc += i; println(acc)})
It is even clearer when you use a single partition:
rdd.repartition(1).foreach(i => {acc += i; println(acc)})
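To make the single-partition case concrete without a cluster, here is a minimal plain-Scala sketch of the idea. ToyAccumulator, copyForTask and ZeroDemo are hypothetical names invented for illustration, not Spark classes; the only assumption carried over from Spark is that a task starts from zero rather than from the driver's initial value.

```scala
// Toy stand-in for an accumulator; NOT the real Spark class.
final class ToyAccumulator(initialValue: Int, zero: Int = 0) {
  private var value_ : Int = initialValue

  // What a task would receive: a copy reset to `zero`,
  // never the driver's initial value (assumption modelled on the answer above).
  def copyForTask(): ToyAccumulator = new ToyAccumulator(zero, zero)

  def +=(term: Int): Unit = value_ += term

  // Task-local running total.
  def localValue: Int = value_
}

object ZeroDemo {
  // Simulate one task processing one partition and record
  // the local value seen after each update.
  def runPartition(acc: ToyAccumulator, partition: Seq[Int]): Seq[Int] = {
    val taskAcc = acc.copyForTask()
    partition.map { i => taskAcc += i; taskAcc.localValue }
  }
}
```

Calling ZeroDemo.runPartition(new ToyAccumulator(123), List(1, 2, 3)) yields the running totals 1, 3, 6: the 123 set on the driver never appears on the task side.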
Why was the exception not thrown (...)?
Because the exception is thrown when you access the value method, and toString does not use it at all. Instead it reads the private value_ variable directly, the same one that value returns if the !deserialized check passes.
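The difference between value and toString can be sketched in plain Scala as follows. GuardedAccumulator and markDeserialized are hypothetical names; this is a simplified model of the behaviour described above, not the actual Spark source.

```scala
// Simplified model of the guard described above; NOT the real Spark class.
final class GuardedAccumulator(initialValue: Int) {
  private var value_ : Int = initialValue
  private var deserialized = false

  // Hypothetical stand-in for what happens when the accumulator
  // arrives on a worker: the local value is reset to zero and the flag is set.
  def markDeserialized(): Unit = {
    value_ = 0
    deserialized = true
  }

  // Guarded accessor: readable on the driver only.
  def value: Int =
    if (!deserialized) value_
    else throw new UnsupportedOperationException("Can't read accumulator value in task")

  // Reads the private field directly, so the guard is never consulted.
  override def toString: String = value_.toString
}
```

After markDeserialized(), println(acc) goes through toString and prints 0, while acc.value throws UnsupportedOperationException. That matches what the question observed: a printed 0 and no exception.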