Spark on YARN resource manager: Relation between YARN Containers and Spark Executors
Question: I'm new to Spark on YARN and don't understand the relation between YARN containers and Spark executors. I tried out a configuration based on the results of the yarn-utils.py script, which can be used to find an optimal cluster configuration.

The Hadoop cluster (HDP 2.4) I'm working on:

- 1 master node:
  - CPU: 2 CPUs with 6 cores each = 12 cores
  - RAM: 64 GB
  - SSD: 2 x 512 GB
- 5 slave nodes:
  - CPU: 2 CPUs with 6 cores each = 12 cores
  - RAM: 64 GB
  - HDD: 4 x 3 TB = 12 TB
- HBase is installed
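For reference, this is a sketch of how the Hortonworks yarn-utils.py script would typically be run for a slave node of this shape. The flags follow the HDP documentation (-c cores, -m memory in GB, -d number of data disks, -k whether HBase is installed); the concrete values are assumptions derived from the node specs listed above:

```bash
# Hypothetical invocation for one slave node of this cluster:
# 12 cores, 64 GB RAM, 4 data disks, HBase installed.
python yarn-utils.py -c 12 -m 64 -d 4 -k True
```

The script prints suggested YARN and MapReduce memory settings (such as yarn.nodemanager.resource.memory-mb and yarn.scheduler.minimum-allocation-mb). On YARN, each Spark executor then runs inside a single YARN container, so the executor's requested memory and cores must fit within those container limits.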