Spark Accumulator value not read by task

Submitted by 三世轮回 on 2019-12-06 12:00:19

Question


I am initializing an accumulator

final Accumulator<Integer> accum = sc.accumulator(0);

Then, inside a map function, I'm trying to increment the accumulator and use the accumulator's value to set a variable.

JavaRDD<UserSetGet> UserProfileRDD1 = temp.map(new Function<String, UserSetGet>() {

            @Override
            public UserSetGet call(String arg0) throws Exception {

                    UserSetGet usg = new UserSetGet();

                    accum.add(1);
                    usg.setPid(accum.value().toString());

                    return usg;
            }
  });

But I'm getting the following error.

16/03/14 09:12:58 ERROR executor.Executor: Exception in task 0.0 in stage 2.0 (TID 2) java.lang.UnsupportedOperationException: Can't read accumulator value in task

EDITED - As per the answer from Avihoo Mamka, getting the accumulator value in tasks is not possible.

So is there any way I can achieve the same thing in parallel, such that the Pid value is set each time a variable (e.g. like a static variable) is incremented in my map function?


Answer 1:


From the Spark docs

Accumulators are variables that are only “added” to through an associative operation and can therefore be efficiently supported in parallel. They can be used to implement counters (as in MapReduce) or sums

...

Only the driver program can read the accumulator’s value, using its value method.

Therefore, trying to read the accumulator's value from within a task in Spark means trying to read its value on a worker, which goes against the concept that the accumulator's value can only be read from the driver.
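For reference, here is a minimal sketch of the intended pattern, assuming the same sc, temp and UserSetGet from the question: add to the accumulator inside the task, but only call value() on the driver after an action has run.

final Accumulator<Integer> accum = sc.accumulator(0);

JavaRDD<UserSetGet> userProfileRDD = temp.map(new Function<String, UserSetGet>() {
            @Override
            public UserSetGet call(String arg0) throws Exception {
                    UserSetGet usg = new UserSetGet();
                    accum.add(1);          // workers may only add to the accumulator
                    return usg;
            }
  });

userProfileRDD.count();                    // an action forces the map to execute
System.out.println(accum.value());         // reading value() here, on the driver, is allowed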


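If the actual goal (as the question's edit suggests) is to give every record a distinct id in parallel, one alternative not mentioned in the original answer is zipWithUniqueId(), which attaches a Long id to each element without any shared counter being read inside the task:

// t is a scala.Tuple2<String, Long>: the original element plus its unique id
JavaRDD<UserSetGet> userProfileRDD =
    temp.zipWithUniqueId().map(new Function<Tuple2<String, Long>, UserSetGet>() {
            @Override
            public UserSetGet call(Tuple2<String, Long> t) throws Exception {
                    UserSetGet usg = new UserSetGet();
                    usg.setPid(t._2().toString());   // per-record id, no accumulator read needed
                    return usg;
            }
  });

The ids produced by zipWithUniqueId() are unique but not consecutive; zipWithIndex() gives consecutive indices at the cost of an extra Spark job.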

Source: https://stackoverflow.com/questions/35983824/spark-accumulator-value-not-read-by-task
