MapReduce jobs get stuck in Accepted state

Asked 2020-12-24 07:39 by 故里飘歌 · 6 answers · 2169 views

I have my own MapReduce code that I'm trying to run, but it just stays in the Accepted state. I tried running another sample MR job that I'd run previously and which was successful.

6 Answers
  • 2020-12-24 08:06

    A job stuck in the ACCEPTED state on YARN usually means there are not enough free resources. You can check this at http://resourcemanager:port/cluster/scheduler:

    1. if Memory Used + Memory Reserved >= Memory Total, memory is not enough
    2. if VCores Used + VCores Reserved >= VCores Total, VCores are not enough
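
    (If you prefer to script this check, the same numbers are exposed as JSON by the ResourceManager REST API at http://resourcemanager:port/ws/v1/cluster/metrics.)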

    It may also be limited by parameters such as maxAMShare.
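
    For the Fair Scheduler, maxAMShare caps the fraction of a queue's resources that ApplicationMasters may use (default 0.5); if that share is exhausted, a new job's AM cannot start and the job sits in ACCEPTED even though the cluster looks idle. A minimal sketch of raising it in fair-scheduler.xml (the queue name here is illustrative):

    <?xml version="1.0"?>
    <allocations>
        <queue name="default">
            <!-- let ApplicationMasters use up to 80% of this queue's resources -->
            <maxAMShare>0.8</maxAMShare>
        </queue>
    </allocations>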

  • 2020-12-24 08:10

    Setting the property yarn.resourcemanager.hostname to the master node's hostname in yarn-site.xml, and copying this file to all the nodes in the cluster so the configuration takes effect everywhere, solved the issue for me.
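
    A minimal sketch of that property (the hostname value is a placeholder for your actual master node):

    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>master-node</value>
    </property>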

  • 2020-12-24 08:18

    I faced the same issue. I changed every configuration mentioned in the answers above, but it was still no use. Then I re-checked the health of my cluster and noticed that my one and only node was in an unhealthy state. The cause was a lack of disk space in my /tmp/hadoop-hadoopUser/nm-local-dir directory. You can check node health in the ResourceManager web UI (default port 8088). To resolve this, I added the property below to yarn-site.xml.

    <property>
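        <!-- default is 90.0; raise it so a nearly-full disk no longer marks the node unhealthy -->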
        <name>yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage</name>
        <value>98.5</value>
    </property>
    

    After restarting my Hadoop daemons, the node status changed to healthy and jobs started to run.
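
    Alternatively, if a larger disk is available, the NodeManager's local directories can be pointed at it instead of loosening the health check (a sketch; the path /data/hadoop/nm-local-dir is hypothetical):

    <property>
        <name>yarn.nodemanager.local-dirs</name>
        <value>/data/hadoop/nm-local-dir</value>
    </property>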

  • 2020-12-24 08:19

    I am using Hadoop 3.0.1. I faced the same issue, where submitted MapReduce jobs were shown as stuck in the ACCEPTED state in the ResourceManager web UI. In that same UI, under Cluster Metrics, Memory Used was 0 and Total Memory was 0; under Cluster Node Metrics, Active Nodes was 0, even though the NameNode web UI listed the data nodes perfectly. Running yarn node -list on the cluster did not display any NodeManagers.

    It turned out that my NodeManagers were not running. After starting the NodeManagers, newly submitted MapReduce jobs could proceed: they were no longer stuck in the ACCEPTED state and moved on to RUNNING.
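
    (For reference, in Hadoop 3.x a NodeManager can be started on a worker with yarn --daemon start nodemanager, or on all workers at once via sbin/start-yarn.sh from the master.)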

  • 2020-12-24 08:24

    I've had the same effect, and found that making more memory available to YARN on each worker node and reducing the memory required per application helped.

    The settings I have (on my very small experimental boxes) in my yarn-site.xml:

    <property>
      <name>yarn.nodemanager.resource.memory-mb</name>
      <value>2200</value>
      <description>Amount of physical memory, in MB, that can be allocated for containers.</description>
    </property>
    
    <property>
      <name>yarn.scheduler.minimum-allocation-mb</name>
      <value>500</value>
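      <description>The minimum allocation, in MB, for every container request at the ResourceManager.</description>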
    </property>
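
    The yarn-site.xml settings above only raise what each node offers; to also shrink what a job asks for, the per-task and ApplicationMaster sizes can be lowered as well (a sketch for mapred-site.xml; the 500 MB values are illustrative and match the minimum allocation above):

    <property>
      <name>yarn.app.mapreduce.am.resource.mb</name>
      <value>500</value>
    </property>

    <property>
      <name>mapreduce.map.memory.mb</name>
      <value>500</value>
    </property>

    <property>
      <name>mapreduce.reduce.memory.mb</name>
      <value>500</value>
    </property>

    With 2200 MB per node and 500 MB containers, up to four containers fit on each node, enough for an ApplicationMaster plus a few tasks even on a small box.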
    
  • 2020-12-24 08:32

    Had the same issue, and for me it was a full hard drive (>90% full). Freeing up space fixed it. (That 90% mark matches the NodeManager disk health checker's default threshold, so this is the same root cause as the unhealthy-node answer above.)
