Map reduce job getting stuck at map 0% reduce 0%

Submitted by ⅰ亾dé卋堺 on 2019-12-19 08:05:49

Question


I am running the famous wordcount example. I have a local and a prod Hadoop setup. The same example works in prod, but it is not working locally. Can someone tell me what I should look for? The job is getting stuck. The task logs are:

~/tmp$ hadoop jar wordcount.jar WordCount /testhistory /outputtest/test
Warning: $HADOOP_HOME is deprecated.

13/08/29 16:12:34 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
13/08/29 16:12:35 INFO input.FileInputFormat: Total input paths to process : 3
13/08/29 16:12:35 INFO util.NativeCodeLoader: Loaded the native-hadoop library
13/08/29 16:12:35 WARN snappy.LoadSnappy: Snappy native library not loaded
13/08/29 16:12:35 INFO mapred.JobClient: Running job: job_201308291153_0015
13/08/29 16:12:36 INFO mapred.JobClient:  map 0% reduce 0%

Locally, Hadoop is running in pseudo-distributed mode. All three processes (namenode, datanode, jobtracker) are running. Let me know if any extra information is required.


Answer 1:


The tasktracker seems to be missing.

Try:

hadoop tasktracker &
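
A quick way to confirm that the TaskTracker is really the missing daemon is to list the running Hadoop processes first. A minimal sketch for a Hadoop 1.x pseudo-distributed node (jps ships with the JDK and lists local Java processes by class name):

# list the Hadoop daemons currently running on this node
jps
# expected: NameNode, DataNode, JobTracker (and usually SecondaryNameNode)
# if TaskTracker is not in the list, start it in the background
hadoop tasktracker &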



Answer 2:


In Hadoop 2.x this problem can be related to memory issues; see the related question MapReduce in Hadoop 2.2.0 not working.




Answer 3:


I had the same problem and this page helped me: http://www.alexjf.net/blog/distributed-systems/hadoop-yarn-installation-definitive-guide/

Basically I solved my problem using the following three steps. The catch is that I had to configure much more memory than I really have; a sketch of the resulting files is shown after the steps.

1) yarn-site.xml

  • yarn.resourcemanager.hostname = hostname_of_the_master
  • yarn.nodemanager.resource.memory-mb = 4000
  • yarn.nodemanager.resource.cpu-vcores = 2
  • yarn.scheduler.minimum-allocation-mb = 4000

2) mapred-site.xml

  • yarn.app.mapreduce.am.resource.mb = 4000
  • yarn.app.mapreduce.am.command-opts = -Xmx3768m
  • mapreduce.map.cpu.vcores = 2
  • mapreduce.reduce.cpu.vcores = 2

3) Copy these files to all nodes
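
A sketch of how these settings might look in the two files, using the values from the list above (a sketch only; adapt the numbers to the memory and cores your nodes really have):

yarn-site.xml:

<configuration>
  <!-- where the ResourceManager runs -->
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>hostname_of_the_master</value>
  </property>
  <!-- memory and vcores this NodeManager may hand out to containers -->
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>4000</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>2</value>
  </property>
  <!-- smallest container the scheduler will allocate -->
  <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>4000</value>
  </property>
</configuration>

mapred-site.xml:

<configuration>
  <!-- memory for the MapReduce ApplicationMaster container and its JVM heap -->
  <property>
    <name>yarn.app.mapreduce.am.resource.mb</name>
    <value>4000</value>
  </property>
  <property>
    <name>yarn.app.mapreduce.am.command-opts</name>
    <value>-Xmx3768m</value>
  </property>
  <!-- vcores requested per map and reduce task -->
  <property>
    <name>mapreduce.map.cpu.vcores</name>
    <value>2</value>
  </property>
  <property>
    <name>mapreduce.reduce.cpu.vcores</name>
    <value>2</value>
  </property>
</configuration>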




Answer 4:


Besides hadoop tasktracker & and the other issues mentioned above, check your code and make sure there is no infinite loop or any other bug. There may simply be a bug in your own code!




Answer 5:


If this problem appears when running Hive queries, check whether you are joining two very big tables without leveraging partitions. Not using partitions can lead to long-running full table scans, which is why the job appears stuck at map 0% reduce 0%.
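
For example, with two hypothetical tables orders and customers that are both partitioned by a dt column, adding partition predicates lets Hive prune the scan to a single partition of each table instead of reading everything:

-- hypothetical tables orders and customers, both partitioned by dt
-- the WHERE predicates on dt restrict the join to one partition per table
SELECT o.customer_id, c.name, o.amount
FROM orders o
JOIN customers c
  ON o.customer_id = c.customer_id
WHERE o.dt = '2013-08-29'
  AND c.dt = '2013-08-29';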



Source: https://stackoverflow.com/questions/18525354/map-reduce-job-getting-stuck-at-map-0-reduce-0
