MapReduce jobs get stuck in Accepted state

Asked 2020-12-24 07:39 by 故里飘歌 · 6 answers · 2169 views

I have my own MapReduce code that I'm trying to run, but it just stays in the Accepted state. I tried running another sample MR job that I'd run previously and which was successful.

6 Answers
  • 2020-12-24 08:06

    A job stuck in the ACCEPTED state on YARN usually means there are not enough free resources. You can check this at http://resourcemanager:port/cluster/scheduler:

    1. if Memory Used + Memory Reserved >= Memory Total, memory is not enough
    2. if VCores Used + VCores Reserved >= VCores Total, VCores are not enough
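
    (If you prefer to script this check, the same numbers are exposed as JSON by the ResourceManager REST API at http://resourcemanager:port/ws/v1/cluster/metrics.)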

    It may also be limited by parameters such as maxAMShare.
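
    For the Fair Scheduler, maxAMShare caps the fraction of a queue's resources that ApplicationMasters may use (default 0.5); if that share is exhausted, a new job's AM cannot start and the job sits in ACCEPTED even though the cluster looks idle. A minimal sketch of raising it in fair-scheduler.xml (the queue name here is illustrative):

    <?xml version="1.0"?>
    <allocations>
        <queue name="default">
            <!-- let ApplicationMasters use up to 80% of this queue's resources -->
            <maxAMShare>0.8</maxAMShare>
        </queue>
    </allocations>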

  • 2020-12-24 08:10

    Setting the property yarn.resourcemanager.hostname to the master node's hostname in yarn-site.xml, and copying this file to all the nodes in the cluster so the configuration takes effect everywhere, solved the issue for me.
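
    A minimal sketch of that property (the hostname value is a placeholder for your actual master node):

    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>master-node</value>
    </property>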

  • 2020-12-24 08:18

    I faced the same issue. I changed every configuration mentioned in the answers above, but it was still no use. Then I re-checked the health of my cluster and noticed that my one and only node was in an unhealthy state. The cause was a lack of disk space in my /tmp/hadoop-hadoopUser/nm-local-dir directory. You can check node health in the ResourceManager web UI (default port 8088). To resolve this, I added the property below to yarn-site.xml.

    <property>
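        <!-- default is 90.0; raise it so a nearly-full disk no longer marks the node unhealthy -->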
        <name>yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage</name>
        <value>98.5</value>
    </property>
    

    After restarting my Hadoop daemons, the node status changed to healthy and jobs started to run.
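
    Alternatively, if a larger disk is available, the NodeManager's local directories can be pointed at it instead of loosening the health check (a sketch; the path /data/hadoop/nm-local-dir is hypothetical):

    <property>
        <name>yarn.nodemanager.local-dirs</name>
        <value>/data/hadoop/nm-local-dir</value>
    </property>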

  • 2020-12-24 08:19

    I am using Hadoop 3.0.1. I faced the same issue, where submitted MapReduce jobs were shown as stuck in the ACCEPTED state in the ResourceManager web UI. In that same UI, under Cluster Metrics, Memory Used was 0 and Total Memory was 0; under Cluster Node Metrics, Active Nodes was 0, even though the NameNode web UI listed the data nodes perfectly. Running yarn node -list on the cluster did not display any NodeManagers.

    It turned out that my NodeManagers were not running. After starting the NodeManagers, newly submitted MapReduce jobs could proceed: they were no longer stuck in the ACCEPTED state and moved on to RUNNING.
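
    (For reference, in Hadoop 3.x a NodeManager can be started on a worker with yarn --daemon start nodemanager, or on all workers at once via sbin/start-yarn.sh from the master.)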

  • 2020-12-24 08:24

    I've had the same effect, and found that making more memory available to YARN on each worker node and reducing the memory required per application helped.

    The settings I have (on my very small experimental boxes) in my yarn-site.xml:

    <property>
      <name>yarn.nodemanager.resource.memory-mb</name>
      <value>2200</value>
      <description>Amount of physical memory, in MB, that can be allocated for containers.</description>
    </property>
    
    <property>
      <name>yarn.scheduler.minimum-allocation-mb</name>
      <value>500</value>
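      <description>The minimum allocation, in MB, for every container request at the ResourceManager.</description>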
    </property>
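
    The yarn-site.xml settings above only raise what each node offers; to also shrink what a job asks for, the per-task and ApplicationMaster sizes can be lowered as well (a sketch for mapred-site.xml; the 500 MB values are illustrative and match the minimum allocation above):

    <property>
      <name>yarn.app.mapreduce.am.resource.mb</name>
      <value>500</value>
    </property>

    <property>
      <name>mapreduce.map.memory.mb</name>
      <value>500</value>
    </property>

    <property>
      <name>mapreduce.reduce.memory.mb</name>
      <value>500</value>
    </property>

    With 2200 MB per node and 500 MB containers, up to four containers fit on each node, enough for an ApplicationMaster plus a few tasks even on a small box.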
    
  • 2020-12-24 08:32

    Had the same issue, and for me it was a full hard drive (>90% full). Freeing up space fixed it. (That 90% mark matches the NodeManager disk health checker's default threshold, so this is the same root cause as the unhealthy-node answer above.)
