Question
I am using Scala parallel collections.
val largeList = list.par.map(x => largeComputation(x)).toList
It is blazing fast, but I have a feeling that I may run into out-of-memory issues if too many "largeComputation" calls run in parallel.
Therefore, when testing, I would like to know how many threads the parallel collection is using and, if need be, how I can configure the number of threads for parallel collections.
Answer 1:
Here is a piece of Scaladoc that explains how to change the task support and wrap a ForkJoinPool inside it. When you instantiate the ForkJoinPool, you pass the desired parallelism level as the parameter:
Here is a way to change the task support of a parallel collection:
import scala.collection.parallel._
val pc = mutable.ParArray(1, 2, 3)
pc.tasksupport = new ForkJoinTaskSupport(new scala.concurrent.forkjoin.ForkJoinPool(2))
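So for your case it will be:
// Keep a reference to the parallel collection so its task support can be changed.
val parList = list.par
parList.tasksupport = new ForkJoinTaskSupport(
  new scala.concurrent.forkjoin.ForkJoinPool(x) // x = desired number of threads
)
val largeList = parList.map(x => largeComputation(x)).toList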
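To check how many threads a parallel collection is set up to use, one option is to read `parallelismLevel` from its task support. The following is a minimal sketch rather than a definitive recipe. It assumes Scala 2.12, where parallel collections ship with the standard library and `ForkJoinTaskSupport` accepts a `java.util.concurrent.ForkJoinPool`. On Scala 2.13+ the separate scala-parallel-collections module and `import scala.collection.parallel.CollectionConverters._` would be needed as well. The object name `ParallelismDemo`, the sample list, and the body of `largeComputation` are hypothetical stand-ins, not part of the original question.

```scala
import java.util.concurrent.ForkJoinPool
import scala.collection.parallel.ForkJoinTaskSupport

object ParallelismDemo {
  // Hypothetical stand-in for the expensive computation in the question.
  def largeComputation(x: Int): Int = x * x

  def main(args: Array[String]): Unit = {
    val list = (1 to 100).toList
    val parList = list.par

    // Parallelism level reported by the default task support
    // (usually tied to the number of available processors).
    println(s"default parallelism: ${parList.tasksupport.parallelismLevel}")

    // Cap the collection at 2 worker threads via a dedicated pool.
    val pool = new ForkJoinPool(2)
    parList.tasksupport = new ForkJoinTaskSupport(pool)
    println(s"configured parallelism: ${parList.tasksupport.parallelismLevel}")

    // Same pipeline as in the question, now running on the capped pool.
    val largeList = parList.map(x => largeComputation(x)).toList
    println(s"computed ${largeList.size} results")

    // A dedicated pool is not shut down automatically.
    pool.shutdown()
  }
}
```

Printing the value before and after setting `tasksupport` is a simple way to confirm during testing that the cap took effect.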
Source: https://stackoverflow.com/questions/38701877/scala-parallel-collections-how-to-know-and-configure-the-number-of-threads