I have a tiny cluster composed of 1 master (namenode, secondarynamenode, resourcemanager) and 2 slaves (datanode, nodemanager).
I have set in the yarn-site.xml of th
I will answer this question on the assumption that the scheduler used is CapacityScheduler.
CapacityScheduler uses a ResourceCalculator for calculating the resources needed for an application. There are 2 types of resource calculators:
1. DefaultResourceCalculator: takes only memory into account.
2. DominantResourceCalculator: takes both memory and CPU (vCores) into account.
By default, the CapacityScheduler uses DefaultResourceCalculator. If you want to use the DominantResourceCalculator, then you need to set the following property in the "capacity-scheduler.xml" file:
<property>
<name>yarn.scheduler.capacity.resource-calculator</name>
<value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
</property>
Now, to answer your questions:
If DominantResourceCalculator is used, then both memory and vCores are taken into account for calculating the number of containers.
mapreduce.map.memory.mb is not an abstract value. It is taken into consideration while calculating the resources.
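For example (an illustrative snippet, using the 768 MB request that the calculation further below starts from), a job can request its map container size like this:
<property>
<name>mapreduce.map.memory.mb</name>
<value>768</value>
</property>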
The DominantResourceCalculator class has a normalize() function, which normalizes the resource request using minimumResource (determined by the config yarn.scheduler.minimum-allocation-mb), maximumResource (determined by the config yarn.scheduler.maximum-allocation-mb) and a step factor (also determined by the config yarn.scheduler.minimum-allocation-mb).
The code for normalizing memory looks like below (Check org.apache.hadoop.yarn.util.resource.DominantResourceCalculator.java):
int normalizedMemory = Math.min(
        roundUp(Math.max(r.getMemory(), minimumResource.getMemory()),
                stepFactor.getMemory()),
        maximumResource.getMemory());
Where:
r = Requested memory
The logic works like this:
a. Take the max of (requested resource, minimum resource) = max(768, 512) = 768
b. roundUp(768, stepFactor) = roundUp(768, 512) = 1024
roundUp does: ((768 + (512 - 1)) / 512) * 512 = (1279 / 512) * 512 = 2 * 512 = 1024 (integer division)
c. min(roundUp(768, stepFactor), maximumResource) = min(1024, 1024) = 1024
So finally, the allotted memory is 1024 MB, which is what you are getting.
For the sake of simplicity, you can say that roundUp increments the demand in steps of 512 MB (which is the minimum resource).
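If you want to play with this arithmetic outside of YARN, here is a minimal standalone sketch of it (this is not the actual YARN source; the class and method names are mine, and the maximum of 1024 MB is just the value assumed above):

public class MemoryNormalizationSketch {

    // Round value up to the nearest multiple of step, using integer arithmetic:
    // ((value + step - 1) / step) * step
    static int roundUp(int value, int step) {
        return ((value + step - 1) / step) * step;
    }

    // Clamp the request to [minimum, maximum] after rounding it up to a step boundary
    static int normalizeMemory(int requested, int minimum, int maximum, int step) {
        return Math.min(roundUp(Math.max(requested, minimum), step), maximum);
    }

    public static void main(String[] args) {
        int requested = 768;   // mapreduce.map.memory.mb
        int minimum   = 512;   // yarn.scheduler.minimum-allocation-mb (also used as the step factor)
        int maximum   = 1024;  // yarn.scheduler.maximum-allocation-mb (assumed value)

        // max(768, 512) = 768 -> roundUp(768, 512) = 1024 -> min(1024, 1024) = 1024
        System.out.println(normalizeMemory(requested, minimum, maximum, minimum)); // prints 1024
    }
}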
mapreduce.map.java.opts is the heap size given to the JVM launched inside the container, whereas mapreduce.map.memory.mb is the total memory used by the container. Therefore, the value of mapreduce.map.java.opts should be less than mapreduce.map.memory.mb.
The answer here explains that in more detail: What is the relation between 'mapreduce.map.memory.mb' and 'mapred.map.child.java.opts' in Apache Hadoop YARN?
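As an illustration of that relation (the exact values are made up; keeping the heap at roughly 80% of the container size is only a common rule of thumb, not something fixed by YARN), the two settings could look like this:
<!-- Illustrative values: the -Xmx heap is kept below the 1024 MB container size -->
<property>
<name>mapreduce.map.memory.mb</name>
<value>1024</value>
</property>
<property>
<name>mapreduce.map.java.opts</name>
<value>-Xmx820m</value>
</property>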
When you use DominantResourceCalculator, it uses the normalize() function to calculate the vCores needed as well.
The code for that is (similar to the normalization of memory):
int normalizedCores = Math.min(
        roundUp(Math.max(r.getVirtualCores(), minimumResource.getVirtualCores()),
                stepFactor.getVirtualCores()),
        maximumResource.getVirtualCores());
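The same arithmetic can be sketched for vCores; here the minimum and maximum come from yarn.scheduler.minimum-allocation-vcores and yarn.scheduler.maximum-allocation-vcores, and the request and limit values below are only illustrative:

public class VcoreNormalizationSketch {

    // Same roundUp helper as in the memory sketch above
    static int roundUp(int value, int step) {
        return ((value + step - 1) / step) * step;
    }

    public static void main(String[] args) {
        int requested = 3;  // e.g. mapreduce.map.cpu.vcores (illustrative value)
        int minimum   = 1;  // yarn.scheduler.minimum-allocation-vcores (illustrative value)
        int maximum   = 4;  // yarn.scheduler.maximum-allocation-vcores (illustrative value)

        // max(3, 1) = 3 -> roundUp(3, 1) = 3 -> min(3, 4) = 3
        int normalizedCores = Math.min(roundUp(Math.max(requested, minimum), minimum), maximum);
        System.out.println(normalizedCores); // prints 3
    }
}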