What is the difference between calling a MapReduce job from main() and from ToolRunner.run()? When the main class (say, MapReduce) extends Configured and implements Tool, does that give it any extra privileges?
There are no extra privileges, but your command-line options are run through the GenericOptionsParser, which allows you to extract certain configuration properties and configure a Configuration object from them:
http://hadoop.apache.org/common/docs/r0.20.2/api/org/apache/hadoop/util/GenericOptionsParser.html
Basically, rather than parsing some options yourself (using the index of each argument in the list), you can configure Configuration properties explicitly from the command line:
hadoop jar myJar.jar com.Main prop1value prop2value
public static void main(String[] args) {
    Configuration conf = new Configuration();
    // manual positional parsing: args must arrive in exactly this order
    conf.set("prop1", args[0]);
    conf.set("prop2", args[1]);

    conf.get("prop1"); // will resolve to "prop1value"
    conf.get("prop2"); // will resolve to "prop2value"
}
Becomes much more condensed with ToolRunner:
hadoop jar myJar.jar com.Main -Dprop1=prop1value -Dprop2=prop2value
public int run(String[] args) {
    // GenericOptionsParser has already applied the -D options for us
    Configuration conf = getConf();

    conf.get("prop1"); // will resolve to "prop1value"
    conf.get("prop2"); // will resolve to "prop2value"
    return 0;
}
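For reference, here's a minimal sketch of how the pieces fit together (the class name Main and the prop1 property are just placeholders): main() simply delegates to ToolRunner.run(), which runs the GenericOptionsParser over the arguments, applies the -D options to the Configuration, and then calls your run() method with whatever arguments remain:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class Main extends Configured implements Tool {

    public static void main(String[] args) throws Exception {
        // ToolRunner strips the generic options (-D, -files, -libjars, ...)
        // and passes only the remaining args on to run()
        int exitCode = ToolRunner.run(new Main(), args);
        System.exit(exitCode);
    }

    @Override
    public int run(String[] args) throws Exception {
        Configuration conf = getConf(); // already populated by the parser
        conf.get("prop1"); // "prop1value" if launched with -Dprop1=prop1value
        return 0;
    }
}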
One final word of warning though: when using getConf(), create your Job object first, then pull its Configuration out. The Job constructor makes a copy of the Configuration object passed in, so if you make changes to the reference you passed in, your job will not see those changes:
public int run(String[] args) throws Exception {
    Configuration conf = getConf();
    conf.set("prop3", "blah");

    Job job = new Job(conf); // job takes a deep copy of conf at this point
    conf.set("prop4", "dummy"); // here we're amending the original conf

    job.getConfiguration().get("prop3"); // will resolve to "blah"
    job.getConfiguration().get("prop4"); // will resolve to null
    return 0;
}
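So, following the advice above, a sketch of the safe ordering (the property names are again placeholders): set everything you can before constructing the Job, and make any later changes through the copy the job actually holds:

public int run(String[] args) throws Exception {
    Configuration conf = getConf();
    conf.set("prop3", "blah"); // safe: set before the Job copies conf

    Job job = new Job(conf);

    // anything after construction must go through the job's own copy
    job.getConfiguration().set("prop4", "dummy");
    job.getConfiguration().get("prop4"); // will resolve to "dummy"
    return 0;
}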