I run my Spark application on a YARN cluster. In my code I use the number of cores available to the queue to create partitions on my dataset:

Dataset ds = ... ds.coalesce(/* number of available cores */);
I found this while looking for the answer to much the same question, and it turns out that:
Dataset ds = ... ds.coalesce(sc.defaultParallelism());
does exactly what the OP was looking for.
For example, my 5-node x 8-core cluster returns 40 for defaultParallelism.
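Putting the pieces together, here is a minimal sketch of how this might look in a Java Spark application. The session setup, app name, and input path are hypothetical; in yarn-cluster mode the master and resources are supplied by `spark-submit`, and `defaultParallelism` typically reflects the total executor cores granted to the application:

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class CoalesceToCoresExample {
    public static void main(String[] args) {
        // Hypothetical session; on YARN the master/queue come from spark-submit flags
        SparkSession spark = SparkSession.builder()
                .appName("coalesce-to-cores") // hypothetical app name
                .getOrCreate();

        Dataset<Row> ds = spark.read().json("input.json"); // hypothetical input

        // defaultParallelism on the underlying SparkContext is usually
        // the total number of cores across all executors
        int parallelism = spark.sparkContext().defaultParallelism();

        // Reduce the partition count to match the available cores
        Dataset<Row> sized = ds.coalesce(parallelism);

        System.out.println("partitions: " + sized.rdd().getNumPartitions());
    }
}
```

Note that `coalesce` only decreases the number of partitions; if the dataset may have fewer partitions than cores and you want to scale up, `repartition(parallelism)` is the call to use instead (at the cost of a full shuffle).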