I run my Spark application on a YARN cluster. In my code I use the number of cores available to the queue to create partitions on my dataset:
Dataset ds = ...
ds.coalesce(numberOfCores);
You could run a job on every machine and ask each one for the number of cores, but that's not necessarily what's available to Spark (as pointed out by @tribbloid in a comment on another answer):
import spark.implicits._
import scala.collection.JavaConverters._
import sys.process._
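// build 1000 rows, map each one to (hostname, core count) on whatever executor runs it,
// then collect and dedupe with toMap so there is one entry per host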
val procs = (1 to 1000).toDF.map(_ => "hostname".!!.trim -> java.lang.Runtime.getRuntime.availableProcessors).collectAsList().asScala.toMap
val nCpus = procs.values.sum
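With nCpus in hand you can size your partitioning directly; a minimal sketch, assuming ds is the dataset from the question:

ds.coalesce(nCpus)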
Running it in the shell (on a tiny test cluster with two workers) gives:
scala> :paste
// Entering paste mode (ctrl-D to finish)
import spark.implicits._
import scala.collection.JavaConverters._
import sys.process._
val procs = (1 to 1000).toDF.map(_ => "hostname".!!.trim -> java.lang.Runtime.getRuntime.availableProcessors).collectAsList().asScala.toMap
val nCpus = procs.values.sum
// Exiting paste mode, now interpreting.
import spark.implicits._
import scala.collection.JavaConverters._
import sys.process._
procs: scala.collection.immutable.Map[String,Int] = Map(ip-172-31-76-201.ec2.internal -> 2, ip-172-31-74-242.ec2.internal -> 2)
nCpus: Int = 4
Add zeros to your range if you typically have lots of machines in your cluster; the range just needs to be large enough that at least one task lands on every machine, otherwise some hosts won't show up in the map. Even on my two-machine cluster, 10000 iterations complete in a couple of seconds.
This is probably only useful if you want more information than sc.defaultParallelism will give you (as in @SteveC's answer).
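For comparison, that simpler value is just a property of the SparkContext; a minimal sketch:

val parallelism = spark.sparkContext.defaultParallelism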