Question
According to http://blog.cloudera.com/blog/2014/04/apache-hadoop-yarn-avoiding-6-time-consuming-gotchas/, the formula for determining the number of concurrently running tasks per node is:
min(yarn.nodemanager.resource.memory-mb / mapreduce.[map|reduce].memory.mb,
    yarn.nodemanager.resource.cpu-vcores / mapreduce.[map|reduce].cpu.vcores).
However, after setting these parameters as follows on a cluster of c3.2xlarge instances:
yarn.nodemanager.resource.memory-mb = 14336
mapreduce.map.memory.mb = 2048
yarn.nodemanager.resource.cpu-vcores = 8
mapreduce.map.cpu.vcores = 1
I see at most 4 tasks running concurrently per node, whereas the formula gives min(14336 / 2048, 8 / 1) = min(7, 8) = 7. What's the deal?
I'm running Hadoop 2.4.0 on AMI 3.1.0.
Answer 1:
My empirical formula was incorrect. The formula provided by Cloudera is the correct one and appears to give the expected number of concurrently running tasks, at least on AMI 3.3.1.
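As a quick sanity check, here is a minimal sketch of the Cloudera formula applied to the settings from the question. The function name `max_concurrent_tasks` is hypothetical, and rounding down with integer division is an assumption (a node cannot run a fraction of a task); the formula as quoted does not state the rounding.

```python
def max_concurrent_tasks(node_memory_mb, task_memory_mb, node_vcores, task_vcores):
    """Per-node bound on concurrent tasks, following the quoted formula.

    Assumption: each ratio is rounded down, since only whole tasks can run.
    """
    by_memory = node_memory_mb // task_memory_mb  # tasks that fit in memory
    by_vcores = node_vcores // task_vcores        # tasks that fit in vcores
    return min(by_memory, by_vcores)


# Map-task settings from the question:
# yarn.nodemanager.resource.memory-mb = 14336, mapreduce.map.memory.mb = 2048,
# yarn.nodemanager.resource.cpu-vcores = 8,    mapreduce.map.cpu.vcores = 1
print(max_concurrent_tasks(14336, 2048, 8, 1))  # prints 7
```

With these values memory is the limiting resource (7 tasks by memory versus 8 by vcores), which matches the 7 expected in the question.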
Source: https://stackoverflow.com/questions/25193201/how-to-set-the-precise-max-number-of-concurrently-running-tasks-per-node-in-hado