Question
The YARN ResourceManager is not showing the total cores for a Spark application. Say I submit a Spark job with 300 executors and executor-cores set to 3. The job should then be using 900 cores in total, but the YARN ResourceManager only shows 300.
Is this just a display error, or does YARN not see the remaining 600 cores?
Environment: HDP 2.2, Scheduler: capacity-scheduler, Spark: 1.4.1
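For reference, a submission along these lines reproduces the symptom (the memory size, class name, and jar are illustrative, not from the original question):
# 300 executors x 3 cores each = 900 cores requested in total,
# yet the ResourceManager UI reports only ~300 vcores in use
spark-submit \
  --master yarn-cluster \
  --num-executors 300 \
  --executor-cores 3 \
  --executor-memory 4g \
  --class com.example.MyApp \
  myapp.jar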
Answer 1:
Set
yarn.scheduler.capacity.resource-calculator=org.apache.hadoop.yarn.util.resource.DominantResourceCalculator
in capacity-scheduler.xml
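In the XML form that capacity-scheduler.xml uses, the property looks like this (a minimal sketch; the ResourceManager needs a restart, or a queue refresh via yarn rmadmin -refreshQueues, for the change to take effect):
<property>
  <name>yarn.scheduler.capacity.resource-calculator</name>
  <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
</property>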
YARN can run more containers than there are allocated cores because the DefaultResourceCalculator is used by default, and it considers only memory when computing how many containers fit on a node.
// From org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator (Hadoop 2.x):
public int computeAvailableContainers(Resource available, Resource required) {
  // Only memory is considered; the vcore request is ignored entirely
  return available.getMemory() / required.getMemory();
}
Use DominantResourceCalculator instead; it takes both CPU and memory into account.
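For comparison, the corresponding method in org.apache.hadoop.yarn.util.resource.DominantResourceCalculator (paraphrased from the Hadoop 2.x source) caps the container count by whichever resource runs out first:
public int computeAvailableContainers(Resource available, Resource required) {
  // The scarcer of the two resources limits how many containers fit on the node
  return Math.min(
      available.getMemory() / required.getMemory(),
      available.getVirtualCores() / required.getVirtualCores());
}
With this calculator in place, the scheduler accounts for the 3 vcores each executor requests, so the ResourceManager UI reflects the full 900 cores.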
You can read more about DominantResourceCalculator, and the Dominant Resource Fairness policy behind it, in the Hadoop CapacityScheduler documentation.
Source: https://stackoverflow.com/questions/32233162/spark-executor-cores-not-shown-in-yarn-resource-manager