YarnException: Unauthorized request to start container

Submitted by 大城市里の小女人 on 2019-12-01 16:04:50

This exception occurs when your nodes have different time settings. Make sure that all 3 of your nodes have the same time and timezone settings, and then restart the machines.

This worked for me. Hope this helps you as well!

One option is to increase the lifespan of the container by setting

yarn.resourcemanager.rm.container-allocation.expiry-interval-ms

which defaults to 10 minutes (600000 ms).

E.g.
Service-Wide / Advanced
YARN Service Configuration Safety Valve for yarn-site.xml

    <property>
       <name>yarn.resourcemanager.rm.container-allocation.expiry-interval-ms</name>
       <value>1000000</value>
    </property>
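
To double-check what value a given yarn-site.xml actually carries, a minimal Python sketch along these lines may help. The `expiry_interval_ms` helper and the `sample` string are hypothetical names made up for illustration; the sketch assumes the file uses the usual `<configuration>` / `<property>` layout and falls back to the 10 minute default when the property is absent.

```python
import xml.etree.ElementTree as ET

PROPERTY = "yarn.resourcemanager.rm.container-allocation.expiry-interval-ms"
DEFAULT_MS = 600_000  # the 10 minute default mentioned above


def expiry_interval_ms(yarn_site_xml: str) -> int:
    """Return the configured expiry interval in ms, or the default if unset.

    Hypothetical helper for illustration; it is not part of Hadoop.
    """
    root = ET.fromstring(yarn_site_xml)
    for prop in root.iter("property"):
        name = prop.findtext("name", default="").strip()
        if name == PROPERTY:
            value = prop.findtext("value", default=str(DEFAULT_MS))
            return int(value.strip())
    return DEFAULT_MS


# Sample input: the property from above wrapped in a <configuration> root.
sample = """<configuration>
    <property>
       <name>yarn.resourcemanager.rm.container-allocation.expiry-interval-ms</name>
       <value>1000000</value>
    </property>
</configuration>"""

interval = expiry_interval_ms(sample)
print(f"{interval} ms = {interval / 60000:.1f} minutes")
```

Note that the example value of 1000000 ms works out to roughly 16.7 minutes, so it only raises the default by about two thirds.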

Beyond just the time settings, make sure the nodes are running NTP or are otherwise reasonably well time-synced. I had the same problem and discovered that one of the nodes had the wrong YEAR set in its date. Once I brought the clocks to within seconds of each other, the error went away.

If you see this error all of a sudden, it might be due to time drift on virtual machines.

All virtual machines can be prone to time drift.

System time can drift by several minutes on long-running clusters if it is not synchronized to a known good time source. So if each of your cluster nodes relies on its own system clock, the clocks can drift apart sporadically over time.

Your Hadoop jobs may initially run successfully, because the drift is not yet noticeable. However, on long-running clusters, if one worker's clock drifts so far from the master's that the difference exceeds the 10 minute interval, jobs fail because the YARN containers scheduled on that worker are marked EXPIRED as soon as the AM submits them.
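
As a rough illustration of that comparison, the Python sketch below flags workers whose clock differs from the master's by at least the expiry interval. The `skewed_nodes` function, the node names, and the clock readings are all hypothetical; how you collect the readings (for example by running `date +%s` on each node) is left out, and collecting them adds a little latency of its own.

```python
from typing import Dict

EXPIRY_INTERVAL_S = 600  # the default 10 minute interval described above


def skewed_nodes(
    clocks: Dict[str, float],
    reference: str,
    limit_s: float = EXPIRY_INTERVAL_S,
) -> Dict[str, float]:
    """Return {node: offset in seconds} for nodes at least limit_s off.

    clocks maps a node name to its clock reading in Unix epoch seconds.
    Hypothetical helper for illustration; it is not part of Hadoop.
    """
    ref = clocks[reference]
    return {
        node: reading - ref
        for node, reading in clocks.items()
        if node != reference and abs(reading - ref) >= limit_s
    }


# Made-up readings: worker2 has the wrong year set (365 days behind).
readings = {
    "master": 1_575_216_290.0,
    "worker1": 1_575_216_292.0,
    "worker2": 1_575_216_290.0 - 365 * 24 * 3600,
}

for node, offset in skewed_nodes(readings, reference="master").items():
    print(f"{node} is off by {offset:.0f} s")
```

In practice you would probably want to alert at a much smaller `limit_s` than the expiry interval, so that drift is caught long before jobs start failing.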

The key part is:

"For any container, if the corresponding NM doesn’t report to the RM that the container has started running within a configured interval of time, by default 10 minutes, the container is deemed as dead and is expired by the RM."

You can learn more about YARN Container allocation here: http://hortonworks.com/blog/apache-hadoop-yarn-resourcemanager/

So, the jobs will work if you increase the value of yarn.resourcemanager.rm.container-allocation.expiry-interval-ms in the yarn-site.xml config file.

But that's just a temporary workaround.


To fix the actual issue, you need to use a synchronization mechanism such as NTP.

NTP keeps your master and worker nodes synchronized with global time servers.

You need to make sure the NTP daemon is up and running on all nodes of the cluster. NTP should also stay "synchronized" (check with ntpstat) during the entire lifecycle of the cluster. Some obvious issues that can cause NTP to lose synchronization:

  • Your firewall may be blocking UDP port 123.
  • You may have an Active Directory (AD) environment whose own time synchronization conflicts with NTP.

http://support.ntp.org/bin/view/Support/TroubleshootingNTP
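
If you suspect the firewall case, one way to probe it is to send a bare SNTP client request over UDP port 123 and see whether anything comes back. The Python sketch below is only an illustration under stated assumptions: `build_request`, `transmit_time`, and `query` are hypothetical helper names, the packet is the simplest possible version 3 client request, and a timeout only suggests (it does not prove) that the port is blocked.

```python
import socket
import struct

NTP_EPOCH_OFFSET = 2_208_988_800  # seconds from 1900-01-01 to 1970-01-01


def build_request() -> bytes:
    """Build a 48-byte SNTP client request (LI=0, VN=3, Mode=3)."""
    return b"\x1b" + 47 * b"\x00"


def transmit_time(response: bytes) -> float:
    """Extract the server's transmit timestamp as Unix epoch seconds."""
    if len(response) < 48:
        raise ValueError("NTP response is shorter than 48 bytes")
    # The transmit timestamp occupies bytes 40..47: seconds, then fraction.
    seconds, fraction = struct.unpack("!II", response[40:48])
    return seconds - NTP_EPOCH_OFFSET + fraction / 2**32


def query(server: str, timeout: float = 2.0) -> float:
    """Ask server for its time over UDP port 123.

    Raises a timeout error if no reply arrives, which is what you would
    expect to see when a firewall drops the traffic.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_request(), (server, 123))
        data, _ = sock.recvfrom(512)
    return transmit_time(data)
```

Calling `query()` with the hostname of your time server from each node, and comparing the result with `time.time()` on that node, gives a crude offset estimate. It ignores network round-trip delay, so treat it as a sanity check rather than a replacement for ntpstat.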
