Question
I have an Apache Spark application running in cluster mode on a YARN cluster (the cluster has 3 nodes).
When the application is running, the Spark UI shows that 2 executors (each running on a different node) and the driver (running on the third node) are active. I wanted the application to use more executors, so I tried adding the argument --num-executors to spark-submit and set it to 6.
spark-submit --driver-memory 3G --num-executors 6 --class main.Application --executor-memory 11G --master yarn-cluster myJar.jar <arg1> <arg2> <arg3> ...
However, the number of executors remains 2.
On the Spark UI I can see that the parameter spark.executor.instances is 6, just as I intended, yet somehow there are still only 2 executors.
I even tried setting this parameter from the code:
sparkConf.set("spark.executor.instances", "6")
Again, I can see that the parameter was set to 6, but still there are only 2 executors.
Does anyone know why I couldn't increase the number of my executors?
yarn.nodemanager.resource.memory-mb is 12G in yarn-site.xml.
Answer 1:
Increase yarn.nodemanager.resource.memory-mb in yarn-site.xml
With 12G per node you can only launch the driver (3G) and 2 executors (11G).
Node1 - driver 3g (+7% overhead)
Node2 - executor1 11g (+7% overhead)
Node3 - executor2 11g (+7% overhead)
Now you are requesting a third 11G executor, and no node has 11G of memory available.
For the 7% overhead, see spark.yarn.executor.memoryOverhead and spark.yarn.driver.memoryOverhead in https://spark.apache.org/docs/1.2.0/running-on-yarn.html
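The node-by-node layout above can be checked with a quick back-of-the-envelope calculation. This is a sketch under assumptions not stated in the question: Spark 1.2's default 7% overhead with a 384 MB floor, and YARN's default allocation increment of 1024 MB (yarn.scheduler.minimum-allocation-mb); exact rounding may differ on your cluster.

```python
import math

NODE = 12 * 1024  # yarn.nodemanager.resource.memory-mb, in MB

def container(heap):
    # heap + memoryOverhead (max of 7% of heap or 384 MB),
    # rounded up to the assumed 1024 MB allocation increment
    return math.ceil((heap + max(int(heap * 0.07), 384)) / 1024) * 1024

driver = container(3 * 1024)     # 4096 MB after rounding
executor = container(11 * 1024)  # 12288 MB after rounding

print(NODE - executor)            # 0     -> each executor node is completely full
print(NODE - driver >= executor)  # False -> the driver's node cannot host executor3
```

So even though 11G < 12G on paper, overhead plus container rounding consumes the whole node.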
Answer 2:
Note that yarn.nodemanager.resource.memory-mb is total memory that a single NodeManager can allocate across all containers on one node.
In your case, since yarn.nodemanager.resource.memory-mb = 12G, if you add up the memory allocated to all YARN containers on any single node, it cannot exceed 12G.
You have requested 11G (--executor-memory 11G) for each Spark executor container. Though 11G is less than 12G, this still won't work. Why?
- Because you have to account for spark.yarn.executor.memoryOverhead, which defaults to max(executorMemory * 0.10, 384) unless you override it.
So, following math must hold true:
spark.executor.memory + spark.yarn.executor.memoryOverhead <= yarn.nodemanager.resource.memory-mb
See: https://spark.apache.org/docs/latest/running-on-yarn.html for latest documentation on spark.yarn.executor.memoryOverhead
Moreover, spark.executor.instances is merely a request. The Spark ApplicationMaster for your application asks the YARN ResourceManager for that number of containers. The ResourceManager grants each container on a NodeManager node based on:
- Resource availability on the node. YARN scheduling has its own nuances - this is a good primer on how YARN FairScheduler works.
- Whether the yarn.nodemanager.resource.memory-mb threshold has not been exceeded on the node:
(number of Spark containers running on the node * (spark.executor.memory + spark.yarn.executor.memoryOverhead)) <= yarn.nodemanager.resource.memory-mb
If the request cannot be granted immediately, it is queued and granted once the above conditions are met.
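To see how the inequality above plays out on this particular cluster, here is a hedged sketch that counts how many executor containers of a given heap size could be placed on 3 nodes of 12G each. The 7% overhead, the 384 MB floor, and the 1024 MB allocation increment (yarn.scheduler.minimum-allocation-mb) are assumptions, not values taken from the question.

```python
import math

NODE_MB = 12 * 1024  # yarn.nodemanager.resource.memory-mb
MIN_ALLOC = 1024     # yarn.scheduler.minimum-allocation-mb (assumed default)

def container_mb(heap_mb, frac=0.07, floor=384):
    # spark.executor.memory + spark.yarn.executor.memoryOverhead,
    # rounded up to the YARN allocation increment
    req = heap_mb + max(int(heap_mb * frac), floor)
    return math.ceil(req / MIN_ALLOC) * MIN_ALLOC

def executors_that_fit(heap_mb, nodes=3, driver_heap_mb=3 * 1024):
    c = container_mb(heap_mb)
    on_plain_node = NODE_MB // c
    on_driver_node = (NODE_MB - container_mb(driver_heap_mb)) // c
    return (nodes - 1) * on_plain_node + on_driver_node

print(executors_that_fit(11 * 1024))  # 2 -> exactly what the asker observes
print(executors_that_fit(3 * 1024))   # 8 -> enough room for the requested 6
```

In other words, with these defaults a smaller --executor-memory would let YARN actually grant the 6 requested containers.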
Answer 3:
To utilize the Spark cluster to its full capacity, you need to set values for --num-executors, --executor-cores and --executor-memory according to your cluster:
- The --num-executors command-line flag or spark.executor.instances configuration property controls the number of executors requested;
- the --executor-cores command-line flag or spark.executor.cores configuration property controls the number of concurrent tasks an executor can run;
- the --executor-memory command-line flag or spark.executor.memory configuration property controls the heap size.
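As a rough illustration of how these three knobs interact, the sketch below applies a common rule-of-thumb sizing recipe (not official Spark guidance; the 5-cores-per-executor figure, the one core and 1 GB reserved per node, and the example cluster are all assumptions):

```python
def size_executors(nodes, cores_per_node, mem_per_node_gb,
                   cores_per_executor=5, overhead_frac=0.07):
    # Rule of thumb: ~5 concurrent tasks per executor; reserve one core
    # and 1 GB per node for the OS/Hadoop daemons; leave one executor
    # slot free for the application driver.
    usable_cores = cores_per_node - 1
    execs_per_node = usable_cores // cores_per_executor
    num_executors = nodes * execs_per_node - 1  # room for the driver
    heap_gb = int((mem_per_node_gb - 1) / execs_per_node * (1 - overhead_frac))
    return num_executors, cores_per_executor, heap_gb

# e.g. a hypothetical 10-node cluster with 16 cores and 64 GB per node:
print(size_executors(10, 16, 64))  # (29, 5, 19)
```

That result would translate to --num-executors 29 --executor-cores 5 --executor-memory 19G on such a cluster.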
Answer 4:
You only have 3 nodes in the cluster, and one is used by the driver, so only 2 nodes are left: how can you create 6 executors?
I think you confused --num-executors with --executor-cores.
To increase concurrency, you need more cores; you want to utilize all the CPUs in your cluster.
Source: https://stackoverflow.com/questions/29940711/apache-spark-setting-executor-instances-does-not-change-the-executors