Efficiency/scalability of parallel collections in Scala (graphs)

问题

So I've been working with parallel collections in Scala for a graph project I'm working on, I've got the basics of the graph class defined, it is currently using a scala.collection.mutable.HashMap where the key is Int and the value is ListBuffer[Int] (adjacency list). (EDIT: This has since been change to ArrayBuffer[Int]

I had done a similar thing a few months ago in C++, with a std::vector<int, std::vector<int> >.

What I'm trying to do now is run a metric between all pairs of vertices in the graph, so in C++ I did something like this:

// myVec = std::vector<int> of vertices
for (std::vector<int>::iterator iter = myVec.begin(); iter != myVec.end(); ++iter) {
    for (std::vector<int>::iterator iter2 = myVec.begin(); 
        iter2 != myVec.end(); ++iter2) {
        /* Run algorithm between *iter and *iter2 */
    }
}

I did the same thing in Scala, parallelized, (or tried to) by doing this:

// vertexList is a List[Int] (NOW CHANGED TO Array[Int] - see below)
vertexList.par.foreach(u =>
  vertexList.foreach(v =>
    /* Run algorithm between u and v */
  )
)

The C++ version is clearly single-threaded, the Scala version has .par so it's using parallel collections and is multi-threaded on 8 cores (same machine). However, the C++ version processed 305,570 pairs in a span of roughly 3 days, whereas the Scala version so far has only processed 23,573 in 17 hours.

Assuming I did my math correctly, the single-threaded C++ version is roughly 3x faster than the Scala version. Is Scala really that much slower than C++, or am I completely mis-using Scala (I only recently started - I'm about 300 pages into Programming in Scala)?

Thanks! -kstruct

EDIT To use a while loop, do I do something like..

// Where vertexList is an Array[Int]
vertexList.par.foreach(u =>
  while (i <- 0 until vertexList.length) {
    /* Run algorithm between u and vertexList(i) */
  }
}

If you guys mean use a while loop for the entire thing, is there an equivalent of .par.foreach for whiles?

EDIT2 Wait a second, that code isn't even right - my bad. How would I parallelize this using while loops? If I have some var i that keeps track of the iteration, then wouldn't all threads be sharing that i?

回答1:

From your comments, I see that your updating a shared mutable HashMap at the end of each algorithm run. And if you're randomizing your walks, a shared Random is also a contention point.

I recommend two changes:

Use .map and .flatMap to return an immutable collection instead of modifying a shared collection.
Use a ThreadLocalRandom (from either Akka or Java 7) to reduce contention on the random number generator
Check the rest of your algorithm for further possible contention points.
You may try running the inner loop in parallel, too. But without knowing your algorithm, it's hard to know if that will help or hurt. Fortunately, running all combinations of parallel and sequential collections is very simple; just switch out pVertexList and vertexList in the code below.

Something like this:

val pVertexList = vertexList.par
val allResult = for {
  u <- pVertexList
  v <- pVertexList
} yield {
  /* Run algorithm between u and v */
  ((u -> v) -> result)
}

The value allResult will be a ParVector[((Int, Int), Int)]. You may call .toMap on it to convert that into a Map.

回答2:

Why mutable? I don't think there's a good parallel mutable map on Scala 2.9.x -- particularly because just such a data structure was added to the upcoming Scala 2.10.

On the other hand... you have a List[Int]? Don't use that, use a Vector[Int]. Also, are you sure you aren't wasting time elsewhere, doing the conversions from your mutable maps and buffers into immutable lists? Scala data structures are different than C++'s so you might well be incurring in complexity problems elsewhere in the code.

Finally, I think dave might be onto something when he asks about contention. If you have contention, parallelism might well make things slower. How faster/slower does it run if you do not make it parallel? If making it not parallel makes it faster, then you most likely do have contention issues.

回答3:

I'm not completely sure about it, but I think foreach loops in foreach loops are rather slow, because lots of objects get created. See: http://scala-programming-language.1934581.n4.nabble.com/for-loop-vs-while-loop-performance-td1935856.html

Try rewriting it using a while loop.

Also Lists are only efficient for head access, Arrays are probably faster.

来源：https://stackoverflow.com/questions/9742938/efficiency-scalability-of-parallel-collections-in-scala-graphs

标签

scala

graph

scala-collections

parallel-collections