How to set the precise max number of concurrently running tasks per node in Hadoop 2.4.0 on Elastic MapReduce

Submitted by 二次信任 on 2019-11-28 03:30:46

Question


According to http://blog.cloudera.com/blog/2014/04/apache-hadoop-yarn-avoiding-6-time-consuming-gotchas/, the formula for determining the number of concurrently running tasks per node is:

min(yarn.nodemanager.resource.memory-mb / mapreduce.[map|reduce].memory.mb,
    yarn.nodemanager.resource.cpu-vcores / mapreduce.[map|reduce].cpu.vcores)
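Plugging the question's c3.2xlarge values into that formula can be sketched in a few lines of Python (the function name is illustrative, not part of any Hadoop API):

```python
def max_concurrent_tasks(node_memory_mb, task_memory_mb, node_vcores, task_vcores):
    """Cloudera's formula: the min of the memory-bound and vcore-bound
    task counts, using integer division since tasks are whole containers."""
    return min(node_memory_mb // task_memory_mb, node_vcores // task_vcores)

# Values from the question (c3.2xlarge):
# min(14336 // 2048, 8 // 1) = min(7, 8) = 7
print(max_concurrent_tasks(14336, 2048, 8, 1))
```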

However, on setting these parameters to (for a cluster of c3.2xlarges):

yarn.nodemanager.resource.memory-mb = 14336

mapreduce.map.memory.mb = 2048

yarn.nodemanager.resource.cpu-vcores = 8

mapreduce.map.cpu.vcores = 1,

I find I'm only getting up to 4 tasks running concurrently per node, when the formula says there should be 7. What's the deal?

I'm running Hadoop 2.4.0 on AMI 3.1.0.


Answer 1:


My empirical formula was incorrect. The formula provided by Cloudera is the correct one and appears to give the expected number of concurrently running tasks, at least on AMI 3.3.1.
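For reference, a sketch of how these properties could be set in the cluster configuration (property names and values are taken directly from the question; on Elastic MapReduce these files are typically populated via a bootstrap action, so exact file locations may differ):

```xml
<!-- yarn-site.xml: total resources the NodeManager advertises per node -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>14336</value>
</property>
<property>
  <name>yarn.nodemanager.resource.cpu-vcores</name>
  <value>8</value>
</property>

<!-- mapred-site.xml: resources each map task's container requests -->
<property>
  <name>mapreduce.map.memory.mb</name>
  <value>2048</value>
</property>
<property>
  <name>mapreduce.map.cpu.vcores</name>
  <value>1</value>
</property>
```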



Source: https://stackoverflow.com/questions/25193201/how-to-set-the-precise-max-number-of-concurrently-running-tasks-per-node-in-hado
