hadoop - map reduce task and static variable

Submitted by 戏子无情 on 2019-12-18 07:06:06

Question


I just started working on some hadoop/hbase MapReduce job (using cloudera) and I have the following question :

Let's say we have a Java class with a main and a static variable. That class defines inner classes corresponding to the Mapper and Reducer tasks. Before launching the job, the main initializes the static variable. This variable is read in the Mapper class. The class is then launched using 'hadoop jar' on a cluster.

My question: I don't see how Map and Reduce tasks running on other nodes can see that static variable. Is there any "hadoop magic" that allows nodes to share a JVM or static variables? How can this even work? I have to work on a class that does just that, and I can't figure out how it can be correct on a multi-node cluster. Thank you


Answer 1:


In a distributed Hadoop cluster, each Map/Reduce task runs in its own separate JVM. So there is no way to share a static variable between class instances running in different JVMs (let alone on different nodes).

But if you want to share some immutable data between tasks, you can use the Configuration class:

// driver code
Configuration config = new Configuration();   // or HBaseConfiguration.create() for HBase jobs
config.setLong("foo.bar.somelong", 1337);
// pass this config to the job, e.g. Job.getInstance(config, "my job")
...

// mapper code
public class SomeMapper ... {
    private long someLong = 0;

    @Override
    protected void setup(Context context) {
        Configuration config = context.getConfiguration();
        // Configuration.getLong requires a default value as its second argument
        someLong = config.getLong("foo.bar.somelong", 0L);
    }
}
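This works because Hadoop serializes the job's Configuration (as job.xml) and ships it over the network to every task attempt, where each task JVM deserializes its own copy; nothing static is shared. A minimal stdlib-only sketch of that idea (the class and method names here are illustrative, not part of the Hadoop API):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Properties;

public class ConfigShippingSketch {
    // "Driver" side: serialize settings to bytes, much like Hadoop writes job.xml.
    static byte[] shipConfig() throws IOException {
        Properties props = new Properties();
        props.setProperty("foo.bar.somelong", "1337");
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        props.store(out, "simulated job.xml");
        return out.toByteArray();
    }

    // "Task JVM" side: deserialize its own copy; statics from the driver never arrive.
    static long readSomeLong(byte[] shipped) throws IOException {
        Properties props = new Properties();
        props.load(new ByteArrayInputStream(shipped));
        // mirror Configuration.getLong(name, defaultValue)
        return Long.parseLong(props.getProperty("foo.bar.somelong", "0"));
    }

    public static void main(String[] args) throws IOException {
        byte[] wire = shipConfig();              // crosses the "network"
        System.out.println(readSomeLong(wire));  // prints 1337
    }
}
```

Note that each task gets a read-only copy: mutating the value in a mapper changes nothing on the driver or in other tasks, which is why this mechanism only suits immutable data.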


Source: https://stackoverflow.com/questions/24280415/hadoop-map-reduce-task-and-static-variable
