Why does Hadoop only launch a local job by default?

孤街浪徒 2021-01-05 12:44

I have written my own Hadoop program, and I can run it in pseudo-distributed mode on my own laptop. However, when I put the program on the cluster, which can run example jar of

3 answers
  • 2021-01-05 13:34

    If you're using Hadoop 2 and your job is running locally instead of on the cluster, make sure that mapred-site.xml contains the mapreduce.framework.name property with a value of yarn. You also need to set up an aux-service in yarn-site.xml (the yarn.nodemanager.aux-services property; on Hadoop 2.2 and later the usual value is mapreduce_shuffle).

    Check out the Cloudera Hadoop 2 operator migration blog for more information.

  • 2021-01-05 13:36

    I had the same problem: every MapReduce v2 (MRv2) or YARN task ran only with the mapred.LocalJobRunner:

    INFO mapred.LocalJobRunner: Starting task: attempt_local284299729_0001_m_000000_0
    

    The ResourceManager and NodeManagers were accessible, and mapreduce.framework.name was set to yarn.

    Setting HADOOP_MAPRED_HOME before executing the job fixed the problem for me.

    export HADOOP_MAPRED_HOME=/usr/lib/hadoop-mapreduce
    

    cheers dan
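
Before exporting HADOOP_MAPRED_HOME as described above, it can help to confirm that the directory really holds the MapReduce client jars. The following is a minimal sketch: the helper name check_mapred_home is hypothetical, and the jar pattern hadoop-mapreduce-client-*.jar is an assumption about how the distribution is packaged, so adjust it to whatever your install actually contains.

```shell
# Report whether a directory looks like a MapReduce home, i.e. it exists and
# holds hadoop-mapreduce-client-*.jar files (an assumed packaging layout).
check_mapred_home() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "$dir: no such directory" >&2
        return 1
    fi
    for jar in "$dir"/hadoop-mapreduce-client-*.jar; do
        # If the glob matched nothing, $jar is the literal pattern.
        if [ -e "$jar" ]; then
            echo "$dir: found $(basename "$jar")"
            return 0
        fi
    done
    echo "$dir: no hadoop-mapreduce-client jars found" >&2
    return 1
}

# Only export the variable when the directory passes the check.
if check_mapred_home /usr/lib/hadoop-mapreduce; then
    export HADOOP_MAPRED_HOME=/usr/lib/hadoop-mapreduce
fi
```

If the check fails, the path from the answer does not match this machine's layout, and the export would point the job at a directory without the MapReduce classes.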

  • 2021-01-05 13:42

    LocalJobRunner has most likely been chosen because your configuration sets the mapred.job.tracker property (or, on Hadoop 2, mapreduce.framework.name) to local, or does not set it at all (in which case the default is local). To check, go to "wherever you extracted/installed hadoop"/etc/hadoop/ and see whether the file mapred-site.xml exists (for me it did not; only a file called mapred-site.xml.template was there). In that file (create it if it doesn't exist), make sure the following property is present:

    <configuration>
      <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
      </property>
    </configuration>
    
    • See the source for org.apache.hadoop.mapred.JobClient.init(JobConf)
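
The check described above can be sketched from a shell as follows. The helper name framework_name is hypothetical, the conf directory fallback (HADOOP_CONF_DIR, else HADOOP_HOME/etc/hadoop) is an assumption about the install layout, and the awk scan is a naive line-based match rather than a real XML parser, so a commented-out property would fool it.

```shell
# Print the <value> paired with mapreduce.framework.name in a Hadoop XML file.
# Naive line-based scan: assumes <name> and <value> each sit on one line.
framework_name() {
    awk '
        /<name>[ \t]*mapreduce\.framework\.name[ \t]*<\/name>/ { found = 1 }
        found && match($0, /<value>[^<]*<\/value>/) {
            # Strip the 7-character <value> and 8-character </value> tags.
            print substr($0, RSTART + 7, RLENGTH - 15)
            exit
        }
    ' "$1"
}

# Assumed layout: HADOOP_CONF_DIR if set, otherwise $HADOOP_HOME/etc/hadoop.
site_file="${HADOOP_CONF_DIR:-${HADOOP_HOME:-.}/etc/hadoop}/mapred-site.xml"

if [ -f "$site_file" ]; then
    echo "mapreduce.framework.name = $(framework_name "$site_file")"
else
    echo "$site_file not found, so the default (local) applies"
fi
```

An empty value after the equals sign means the file exists but does not set the property, which has the same effect as the file being absent.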

    What is the value of this configuration property in the Hadoop configuration on the machine you are submitting the job from? Also confirm that the hadoop executable you are running references this configuration (and that you don't have two or more installations configured differently): run which hadoop and trace any symlinks you come across.
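
Tracing the symlinks by hand can be shortened with a small helper. This is a sketch under assumptions: the function name resolve_cmd is hypothetical, and readlink -f assumes GNU coreutils or a compatible readlink.

```shell
# Show where a command resolves on PATH and the real file behind any symlinks.
resolve_cmd() {
    cmd_path=$(command -v "$1") || {
        echo "$1: not found on PATH" >&2
        return 1
    }
    echo "on PATH:  $cmd_path"
    echo "resolved: $(readlink -f "$cmd_path")"
}

resolve_cmd hadoop || true
```

If the resolved path sits under a different installation than the one whose etc/hadoop directory you edited, the job is being submitted with the other installation's configuration.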

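    Alternatively, if you know the JobTracker host and port number, you can override this when you submit your job by using the -jt option. It is a generic option, so it only takes effect if the job's driver parses generic options (for example via ToolRunner):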

    hadoop jar MyRandomForest_oob_distance.jar -jt hostname:port hdfs://montana-01:8020/user/randomforest/input/genotype1.txt hdfs://montana-01:8020/user/randomforest/input/phenotype1.txt hdfs://montana-01:8020/user/randomforest/output1_distance/ hdfs://montana-01:8020/user/randomforest/input/genotype101.txt hdfs://montana-01:8020/user/randomforest/input/phenotype101.txt 33 500 1
    