Why does Hadoop launch only local jobs by default?


I have written my own Hadoop program and can run it in pseudo-distributed mode on my own laptop. However, when I put the program on the cluster, which can run the example jar of

3 Answers

旧时难觅i

2021-01-05 13:34

If you're using Hadoop 2 and your job runs locally instead of on the cluster, make sure mapred-site.xml contains the mapreduce.framework.name property with a value of yarn. You also need to set up an aux-service in yarn-site.xml.

Check out the Cloudera Hadoop 2 operator migration blog for more information.
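
A quick way to see what those two files actually contain is to read the properties from the shell. The sketch below is only illustrative: get_hadoop_prop is a hypothetical helper, not part of Hadoop, and it assumes each name and value element sits on a single line, as in typical hand-written site files. The aux-service property name yarn.nodemanager.aux-services is not given above; it is the usual one on Hadoop 2, but verify it against your distribution.

```shell
# Hypothetical helper (not a Hadoop command): print the <value> of a named
# property from a Hadoop *-site.xml file.
# Assumption: <name>...</name> and <value>...</value> each fit on one line,
# and the <value> follows its <name>. This is a text scan, not an XML parser.
get_hadoop_prop() {
  file=$1
  name=$2
  awk -v want="$name" '
    /<name>/ {
      line = $0
      sub(/.*<name>[ \t]*/, "", line)      # drop everything up to the name text
      sub(/[ \t]*<\/name>.*/, "", line)    # drop the closing tag and the rest
      found = (line == want)
    }
    found && /<value>/ {
      line = $0
      sub(/.*<value>[ \t]*/, "", line)
      sub(/[ \t]*<\/value>.*/, "", line)
      print line
      exit
    }
  ' "$file"
}
```

For example, get_hadoop_prop "$HADOOP_CONF_DIR/mapred-site.xml" mapreduce.framework.name should print yarn on a correctly configured client, and get_hadoop_prop "$HADOOP_CONF_DIR/yarn-site.xml" yarn.nodemanager.aux-services should print the configured shuffle service. Empty output means the property is not set in that file.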

梦毁少年i

2021-01-05 13:36
I had the same problem: every MapReduce v2 (MRv2) or YARN task ran only with mapred.LocalJobRunner:
```
INFO mapred.LocalJobRunner: Starting task: attempt_local284299729_0001_m_000000_0
```
The ResourceManager and NodeManagers were accessible, and mapreduce.framework.name was set to yarn.

Setting HADOOP_MAPRED_HOME before executing the job fixed the problem for me:
```
export HADOOP_MAPRED_HOME=/usr/lib/hadoop-mapreduce
```
cheers dan
谎友^

2021-01-05 13:42
LocalJobRunner was most probably chosen because your configuration has the mapred.job.tracker property set to local, or does not set it at all (in which case the default is local). To check, go to "wherever you extracted/installed hadoop"/etc/hadoop/ and see whether the file mapred-site.xml exists (for me it did not; only a file called mapred-site.xml.template was there). In that file (create it if it doesn't exist), make sure the following property is present:
```
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
```
- See the source for org.apache.hadoop.mapred.JobClient.init(JobConf)
What is the value of this configuration property in the Hadoop configuration on the machine you are submitting the job from? Also confirm that the hadoop executable you are running references this configuration (and that you don't have two or more installations configured differently): run which hadoop and trace any symlinks you come across.
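
Tracing the symlinks by hand can be shortened with a small shell function. This is a sketch under assumptions: resolve_cmd is a hypothetical name, and it relies on a readlink that supports -f (GNU coreutils and recent macOS do; older BSD variants may not).

```shell
# Hypothetical helper: print the real path behind a command name,
# following every symlink along the way.
# Assumption: readlink supports the -f flag on this system.
resolve_cmd() {
  path=$(command -v "$1") || return 1   # fail if the command is not on PATH
  readlink -f "$path"
}
```

Running resolve_cmd hadoop shows which installation the hadoop on your PATH really belongs to. Comparing that directory with the value of $HADOOP_CONF_DIR (if set) helps spot a client that picks up a different configuration than the one you edited.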

Alternatively, if you know the JobTracker host and port number, you can override this when you submit your job by using the -jt option:
```
hadoop jar MyRandomForest_oob_distance.jar -jt hostname:port hdfs://montana-01:8020/user/randomforest/input/genotype1.txt hdfs://montana-01:8020/user/randomforest/input/phenotype1.txt hdfs://montana-01:8020/user/randomforest/output1_distance/ hdfs://montana-01:8020/user/randomforest/input/genotype101.txt hdfs://montana-01:8020/user/randomforest/input/phenotype101.txt 33 500 1
```