Scala Parallel Collections: How to know and configure the number of threads

Submitted by 牧云@^-^@ on 2020-01-16 04:11:07

Question


I am using Scala parallel collections.

val largeList = list.par.map(x => largeComputation(x)).toList

It is blazing fast, but I have a feeling that I may run into out-of-memory issues if I run too many "largeComputation" calls in parallel.

Therefore, when testing, I would like to know how many threads the parallel collection is using and, if need be, how I can configure the number of threads for parallel collections.


Answer 1:


The Scaladoc explains how to change a parallel collection's task support by wrapping a ForkJoinPool in it. When you instantiate the ForkJoinPool, you pass the desired parallelism level as its argument. Quoting the Scaladoc:

Here is a way to change the task support of a parallel collection:

import scala.collection.parallel._
val pc = mutable.ParArray(1, 2, 3)
pc.tasksupport = new ForkJoinTaskSupport(new scala.concurrent.forkjoin.ForkJoinPool(2))

So in your case it would be:

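// Assumes `import scala.collection.parallel._` as above; numThreads is the desired parallelism level.
// On Scala 2.12+, pass a java.util.concurrent.ForkJoinPool instead (scala.concurrent.forkjoin is deprecated in 2.12 and removed in 2.13).
val parList = list.par
parList.tasksupport = new ForkJoinTaskSupport(
  new scala.concurrent.forkjoin.ForkJoinPool(numThreads)
)
val largeList = parList.map(x => largeComputation(x)).toList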
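
To see how many threads a parallel collection will use, one option is to read `parallelismLevel` from its task support. The following is a minimal sketch, assuming Scala 2.12 or later (on 2.13 the separate scala-parallel-collections module must be on the classpath). The names `ParallelismDemo` and `run` are hypothetical and only serve the illustration, and a small `ParVector` stands in for `list.par`.

```scala
import java.util.concurrent.ForkJoinPool
import scala.collection.parallel.ForkJoinTaskSupport
import scala.collection.parallel.immutable.ParVector

object ParallelismDemo {
  // Hypothetical helper: runs a small map on a pool of `numThreads` workers and
  // returns the parallelism level the collection reports, plus the result.
  def run(numThreads: Int): (Int, List[Int]) = {
    val pc = ParVector(1, 2, 3, 4)

    // Before any configuration the default task support is in effect;
    // its level usually matches the number of available processors.
    println(s"default parallelism: ${pc.tasksupport.parallelismLevel}")

    val pool = new ForkJoinPool(numThreads)
    try {
      pc.tasksupport = new ForkJoinTaskSupport(pool)
      // For ForkJoinTaskSupport this is the parallelism of the wrapped pool.
      val level = pc.tasksupport.parallelismLevel
      (level, pc.map(_ * 2).toList)
    } finally pool.shutdown() // release the worker threads once the work is done
  }

  def main(args: Array[String]): Unit = {
    val (level, doubled) = run(2)
    println(s"configured parallelism: $level, result: $doubled")
  }
}
```

Creating the pool explicitly, rather than inline, makes it possible to shut it down afterwards, which matters if many short-lived pools are created during testing.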


Source: https://stackoverflow.com/questions/38701877/scala-parallel-collections-how-to-know-and-configure-the-number-of-threads
