Spark - How many executors and cores are allocated to my Spark job

故里飘歌 2020-12-18 06:49

Spark architecture revolves entirely around the concept of executors and cores. I would like to see, in practice, how many executors and cores are running for my Spark application.

3 Answers
  • 2020-12-18 07:30

    This is an old question, but this is my code for figuring this out on Spark 2.3.0:

    executor_count = len(spark.sparkContext._jsc.sc().statusTracker().getExecutorInfos()) - 1
    cores_per_executor = int(spark.sparkContext.getConf().get('spark.executor.cores', '1'))
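
    Putting those two values together gives a rough total for the job; a minimal follow-up sketch, assuming the snippet above has already run in the same session:

    # Uses executor_count and cores_per_executor computed in the snippet above
    total_cores = executor_count * cores_per_executor
    print("{0} executors x {1} cores = {2} total cores".format(
        executor_count, cores_per_executor, total_cores))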
    
  • 2020-12-18 07:38

    This is a Python example to get the number of cores (including the master's):

    def workername():
        import socket
        return str(socket.gethostname())

    anrdd = sc.parallelize(['', ''])
    namesRDD = anrdd.flatMap(lambda e: (1, workername()))
    namesRDD.count()
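
    Along the same lines, a hedged sketch (assumptions: an active SparkContext named sc, a hypothetical helper name host_of, and enough small tasks spread out to reach every executor) that collects the distinct worker hostnames instead of only counting elements:

    def host_of(_):
        # Runs on the executors; each task reports the hostname it executed on
        import socket
        return socket.gethostname()

    # Many small tasks so every executor is likely to run at least one
    hosts = sc.parallelize(range(1000), 100).map(host_of).distinct().collect()
    print(hosts)       # e.g. ['worker-1', 'worker-2', ...]
    print(len(hosts))  # distinct hosts; executors sharing a host collapse into one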

  • 2020-12-18 07:41

    Scala (programmatic way):

    getExecutorStorageStatus and getExecutorMemoryStatus both return the executors, including the driver, as in the example snippet below.

    /** Method that just returns the current active/registered executors
      * excluding the driver.
      * @param sc The spark context to retrieve registered executors.
      * @return a list of executors, each in the form of host:port.
      */
    def currentActiveExecutors(sc: SparkContext): Seq[String] = {
      val allExecutors = sc.getExecutorMemoryStatus.map(_._1)
      val driverHost: String = sc.getConf.get("spark.driver.host")
      allExecutors.filter(!_.split(":")(0).equals(driverHost)).toList
    }
    
    sc.getConf.getInt("spark.executor.instances", 1)
    

    Similarly, you can get all the properties and print them as below; you may find the cores information there as well:

    sc.getConf.getAll.mkString("\n")
    

    OR

    sc.getConf.toDebugString
    

    Mostly, spark.executor.cores holds the cores per executor, and spark.driver.cores holds the value for the driver.

    Python:

    The methods above, getExecutorStorageStatus and getExecutorMemoryStatus, are not implemented in the Python API.

    EDIT: But they can be accessed through the Py4J bindings exposed from the SparkContext.

    sc._jsc.sc().getExecutorMemoryStatus()
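
    For example, a minimal sketch that turns that call into an executor count (assuming the usual sc; the returned object is a Py4J proxy of a Scala Map, so size() is invoked as a method):

    # getExecutorMemoryStatus() returns a map keyed by "host:port" and
    # includes the driver, hence the - 1
    memory_status = sc._jsc.sc().getExecutorMemoryStatus()
    num_executors = memory_status.size() - 1
    print(num_executors)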
