Iterate over lines in a file in parallel (Scala)?

遥遥无期 2020-12-12 22:33

I know about the parallel collections in Scala. They are handy! However, I would like to iterate over the lines of a file that is too large for memory in parallel. I coul

5 Answers
  •  甜味超标
    2020-12-12 23:19

    I realize this is an old question, but you may find ParIterator from the iterata library useful as a no-assembly-required way to do this:

    scala> import com.timgroup.iterata.ParIterator.Implicits._
    scala> val it = (1 to 100000).toIterator.par().map(n => (n + 1, Thread.currentThread.getId))
    scala> it.map(_._2).toSet.size
    res2: Int = 8 // addition was distributed over 8 threads
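
For the file case in the question, a similar chunked approach can be sketched with the standard library alone. This is a minimal sketch and not how iterata is implemented: `ParallelLines`, `foldLines`, and the `chunkSize` parameter are hypothetical names introduced here for illustration. It reads the file lazily, processes one chunk of lines at a time on the global execution context, and folds the results so that only one chunk is held in memory.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.Duration
import scala.io.Source

object ParallelLines {
  // Hypothetical helper: applies `f` to each line in parallel, one chunk at a
  // time, and combines the per-line results into a single value.
  def foldLines[A, B](path: String, chunkSize: Int, zero: B)(f: String => A)(combine: (B, A) => B): B = {
    val source = Source.fromFile(path, "UTF-8")
    try {
      source.getLines()                 // lazy iterator; the file is not loaded at once
        .grouped(chunkSize)             // at most `chunkSize` lines are materialized
        .foldLeft(zero) { (acc, chunk) =>
          // One Future per line keeps the sketch short; batching several lines
          // per Future would reduce scheduling overhead for cheap `f`.
          val futures = chunk.map(line => Future(f(line)))
          val results = Await.result(Future.sequence(futures), Duration.Inf)
          results.foldLeft(acc)(combine) // combined sequentially, in line order
        }
    } finally source.close()
  }
}
```

For example, `ParallelLines.foldLines("big.log", 10000, 0L)(_.length.toLong)(_ + _)` would total the line lengths of a (hypothetical) file `big.log`. The chunk size trades memory for parallelism: larger chunks keep more cores busy but hold more lines in memory at once.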
    
