Spark - How many Executors and Cores are allocated to my spark job

Submitted by 北城余情 on 2019-11-29 07:58:32
Ram Ghadiyaram

Scala (Programmatic way) :

getExecutorStorageStatus and getExecutorMemoryStatus both return status per executor, including the driver. The snippet below filters the driver out and returns only the active executors.

/** Method that just returns the current active/registered executors
  * excluding the driver.
  * @param sc The spark context to retrieve registered executors.
  * @return a list of executors each in the form of host:port.
  */
def currentActiveExecutors(sc: SparkContext): Seq[String] = {
  val allExecutors = sc.getExecutorMemoryStatus.map(_._1)
  val driverHost: String = sc.getConf.get("spark.driver.host")
  allExecutors.filter(! _.split(":")(0).equals(driverHost)).toList
}
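The filtering idea above can be sketched in plain Python, independent of a live SparkContext. The addresses below are hypothetical examples of the "host:port" keys that getExecutorMemoryStatus returns; only the string handling is illustrated here.

```python
def active_executors(all_addresses, driver_host):
    """Return the addresses whose host part differs from the driver's host."""
    return [a for a in all_addresses if a.split(":")[0] != driver_host]

# Hypothetical executor addresses; the first one is the driver.
addresses = ["10.0.0.1:34567", "10.0.0.2:34568", "10.0.0.3:34569"]
print(active_executors(addresses, "10.0.0.1"))  # → ['10.0.0.2:34568', '10.0.0.3:34569']
```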

To read the configured number of executor instances (defaulting to 1):

sc.getConf.getInt("spark.executor.instances", 1)

Similarly, you can print all configuration properties to find the core settings as well:

sc.getConf.getAll.mkString("\n")

OR

sc.getConf.toDebugString

Typically, spark.executor.cores holds the cores per executor, and spark.driver.cores the cores for the driver.
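The getInt-with-default lookup used above can be sketched on a plain dict, since SparkConf stores every value as a string. The conf values here are hypothetical stand-ins for what sc.getConf would return.

```python
# Minimal sketch of SparkConf.getInt semantics over a plain dict:
# look up the key, fall back to a default, and convert to int.
conf = {"spark.executor.cores": "4", "spark.driver.cores": "2"}  # hypothetical values

def get_int(conf, key, default):
    """Return conf[key] as an int, or the default if the key is absent."""
    return int(conf.get(key, default))

print(get_int(conf, "spark.executor.cores", 1))   # → 4
print(get_int(conf, "spark.executor.instances", 1))  # → 1 (key absent, default used)
```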

Python :

The methods getExecutorStorageStatus and getExecutorMemoryStatus above are not implemented in the Python API.

EDIT: They can, however, be reached through the Py4J bindings exposed on the SparkContext:

sc._jsc.sc().getExecutorMemoryStatus()

This is an old question, but this is my code for figuring this out on Spark 2.3.0:

executor_count = len(spark.sparkContext._jsc.sc().statusTracker().getExecutorInfos()) - 1
cores_per_executor = int(spark.sparkContext.getConf().get('spark.executor.cores', '1'))
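Given those two values, the total number of task slots in the cluster is simply their product. A minimal sketch with hypothetical numbers (in practice executor_count and cores_per_executor come from the snippet above):

```python
# Hypothetical values standing in for the statusTracker/getConf results above.
executor_count = 5        # executors excluding the driver
cores_per_executor = 4    # value of spark.executor.cores

# Total concurrent task slots available to the job.
total_cores = executor_count * cores_per_executor
print(total_cores)  # → 20
```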

This is a Python example to get the hostnames of the workers (including the master's):

def workername():
    import socket
    return str(socket.gethostname())

anrdd = sc.parallelize(['', ''])
namesRDD = anrdd.flatMap(lambda e: (1, workername()))
namesRDD.count()
