Difference in calling the job

情话喂你 2021-02-01 07:31

What is the difference between calling a MapReduce job from main() and from ToolRunner.run()? When we say that the main class say, MapReduce exte

2 answers
  •  故里飘歌
    2021-02-01 08:21

    There are no extra privileges, but your command-line options are run through GenericOptionsParser, which lets you extract certain configuration properties and populate a Configuration object from them:

    http://hadoop.apache.org/common/docs/r0.20.2/api/org/apache/hadoop/util/GenericOptionsParser.html

    Basically, rather than parsing the options yourself (using the index of each argument in the list), you can set Configuration properties explicitly from the command line. Parsing by hand looks like this:

    hadoop jar myJar.jar com.Main prop1value prop2value
    
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Positional arguments have to be mapped to property names by hand
        conf.set("prop1", args[0]);
        conf.set("prop2", args[1]);
    
        conf.get("prop1"); // will resolve to "prop1value"
        conf.get("prop2"); // will resolve to "prop2value"
    }
    

    With ToolRunner, this becomes much more condensed:

    hadoop jar myJar.jar com.Main -Dprop1=prop1value -Dprop2=prop2value
    
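    public int run(String[] args) throws Exception {
        // Already populated from the -D options by GenericOptionsParser
        Configuration conf = getConf();
    
        conf.get("prop1"); // will resolve to "prop1value"
        conf.get("prop2"); // will resolve to "prop2value"
        return 0; // run() must return an exit code
    }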
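
    To make the hand-off a bit more concrete, here is a rough plain-Java sketch of the idea. It is not Hadoop's code: MiniToolRunner, MiniTool and parseGenericOptions are hypothetical names invented for this illustration, a plain Map stands in for Configuration, and only the fused -Dkey=value form is handled. The runner strips the -D pairs into the configuration first and then passes whatever is left to run().

    ```java
    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    // Hypothetical stand-in for the ToolRunner pattern; not part of Hadoop.
    public class MiniToolRunner {

        /** Hypothetical counterpart of a Tool: receives the config and the leftover args. */
        public interface MiniTool {
            int run(Map<String, String> conf, String[] remainingArgs);
        }

        /** Moves every -Dkey=value argument into conf and returns the other arguments in order. */
        public static String[] parseGenericOptions(Map<String, String> conf, String[] args) {
            List<String> remaining = new ArrayList<>();
            for (String arg : args) {
                int eq = arg.indexOf('=');
                if (arg.startsWith("-D") && eq > 2) {
                    // Text between "-D" and '=' is the key, the rest is the value
                    conf.put(arg.substring(2, eq), arg.substring(eq + 1));
                } else {
                    remaining.add(arg);
                }
            }
            return remaining.toArray(new String[0]);
        }

        /** Parses the generic options first, then hands control to the tool. */
        public static int run(Map<String, String> conf, MiniTool tool, String[] args) {
            String[] remaining = parseGenericOptions(conf, args);
            return tool.run(conf, remaining);
        }

        public static void main(String[] args) {
            MiniTool tool = (conf, rest) -> {
                System.out.println("prop1=" + conf.get("prop1") + ", leftover args=" + rest.length);
                return 0;
            };
            int exitCode = run(new HashMap<>(), tool, args);
            System.out.println("exit code: " + exitCode);
        }
    }
    ```

    In a real driver the same shape is usually written as a class that extends Configured and implements Tool, with main() delegating to ToolRunner.run(); the sketch above only mirrors that flow.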
    

    One final word of warning, though: when you obtain the Configuration via getConf(), create your Job object first and then pull its Configuration out. The Job constructor makes a copy of the Configuration object passed in, so if you make changes to the reference you passed in, your job will not see those changes:

    public int run(String[] args) throws Exception {
        Configuration conf = getConf();
    
        conf.set("prop3", "blah"); // set before the copy, so the job sees it
    
        Job job = new Job(conf); // job will have a deep copy of conf
    
        conf.set("prop4", "dummy"); // here we're amending the original conf
    
        job.getConfiguration().get("prop4"); // will resolve to null
        return 0;
    }
    
