Java 8: First use of stream() or parallelStream() very slow - Usage in practice meaningful?

Posted by 时光总嘲笑我的痴心妄想 on 2021-02-18 23:03:45

Question


Over the last few days I ran some tests with external iteration, streams, and parallel streams in Java 8 and measured the execution time. I also read about the warm-up time that I have to take into account. But one question still remains.

The first time I call stream() or parallelStream() on a collection, the execution time is higher than for an external iteration. I already know that when I call stream() or parallelStream() repeatedly on the same collection and average the execution time, parallelStream() is indeed faster than the external iteration. But since in practice a collection is often iterated only once, I see only a disadvantage in using streams or parallel streams.

So my question is:

If I iterate a collection only once, is it a good idea to use stream() or parallelStream(), or will the execution time always be higher than for external iteration?


Answer 1:


Entirely coincidentally (apparently), Doug Lea, Brian Goetz, and several other folks have written a document called Stream Parallel Guidance. (This is only a draft.) It does have some useful discussion about when to use parallel vs. sequential streams.

A brief summary: a parallel stream is more expensive to start up than a sequential stream. If your workload is splittable, and you have multiple CPU cores that can be brought to bear on the problem, and if the per-element cost isn't unreasonably small, you'll get a parallel speedup with a sufficiently large workload. (How's that for a lot of conditionals?) Oh, and you also have to be careful about benchmarking.
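
As a rough illustration of those conditions, the sketch below runs the same splittable workload once sequentially and once in parallel. The class name ParallelSketch, the costly function, and the element count are made up for this example and do not come from the guidance document. The single-shot timings in main are only indicative; a proper harness such as JMH would be needed for a trustworthy measurement.

```java
import java.util.stream.LongStream;

class ParallelSketch {
    // Hypothetical per-element work, deliberately not trivially cheap.
    static long costly(long n) {
        long h = n;
        for (int i = 0; i < 100; i++) {
            h = h * 31 + i;
        }
        return h;
    }

    // Sequential pipeline over the range [0, count).
    static long sequentialSum(long count) {
        return LongStream.range(0, count)
                .map(ParallelSketch::costly)
                .sum();
    }

    // Same pipeline, but split across the common fork/join pool.
    static long parallelSum(long count) {
        return LongStream.range(0, count)
                .parallel()
                .map(ParallelSketch::costly)
                .sum();
    }

    public static void main(String[] args) {
        long count = 1_000_000; // assumed "sufficiently large" workload
        long t0 = System.nanoTime();
        long a = sequentialSum(count);
        long t1 = System.nanoTime();
        long b = parallelSum(count);
        long t2 = System.nanoTime();
        System.out.println("same result: " + (a == b));
        System.out.println("sequential ms: " + (t1 - t0) / 1_000_000);
        System.out.println("parallel ms:   " + (t2 - t1) / 1_000_000);
    }
}
```

Shrinking count to a handful of elements, or replacing costly with a plain addition, is likely to make the parallel version lose, which is the situation the next paragraph describes.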

StackOverflow is littered with questions that attempt to add up a few integers in parallel and then claim that parallel streams are no good because they don't provide any speedup. I won't even bother linking to them.

Now, you asked about "external iteration" (basically a for-loop) vs. streams, parallel or sequential. I think it's important to consider parallel vs. sequential streams first, as I've done above. This will help inform further decisions. Clearly, if there is a possibility you'll need to run things in parallel, then you should probably go with streams, even if you initially start sequentially.
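
To make the "start sequentially, go parallel later" point concrete, here is a minimal sketch in which the pipeline is written once and only the stream source changes. The class SwitchableSketch, the method countLongWords, and the length threshold are hypothetical names chosen for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

class SwitchableSketch {
    // The filter/count pipeline is identical in both modes;
    // only the choice of stream() vs. parallelStream() differs.
    static long countLongWords(List<String> words, boolean parallel) {
        Stream<String> source = parallel ? words.parallelStream() : words.stream();
        return source.filter(w -> w.length() > 4).count();
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("alpha", "beta", "gamma", "pi", "delta");
        System.out.println(countLongWords(words, false));
        System.out.println(countLongWords(words, true));
    }
}
```

An equivalent hand-written for-loop would need to be restructured (threads, partitioning, result merging) to run in parallel, whereas here the change is a single call.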

Even if you don't intend to go parallel, there are still a number of considerations between for-loops and sequential streams. There is a certain amount of overhead of streams compared to conventional loops -- especially for-loops over an array. But this is usually amortized over the workload. Even if the collection is iterated only once, amortization of the setup can occur if the number of elements in the collection is sufficiently large. For example, if the collection has 10 elements, the extra setup cost of a stream probably isn't worth it. If the collection has 10,000 elements, it might be a different story.

For-loops over arrays are particularly fast because the only "setup" is initializing loop counters and limit values in registers. JIT compilers can bring many loop optimizations to bear as well. It's rare for sequential streams to beat a for-loop over an array, though it can happen.
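
For comparison, the two forms being discussed look roughly like this. This is only a sketch: ArrayLoopSketch and its method names are invented for the example, and it demonstrates that both forms compute the same result, not which one is faster on a given JVM.

```java
import java.util.Arrays;

class ArrayLoopSketch {
    // Plain indexed loop: the only setup is the counter and the array length.
    static long loopSum(int[] values) {
        long total = 0;
        for (int i = 0; i < values.length; i++) {
            total += values[i];
        }
        return total;
    }

    // Sequential stream over the same array; widened to long to match loopSum.
    static long streamSum(int[] values) {
        return Arrays.stream(values).asLongStream().sum();
    }

    public static void main(String[] args) {
        int[] values = new int[10_000];
        for (int i = 0; i < values.length; i++) {
            values[i] = i;
        }
        System.out.println("equal: " + (loopSum(values) == streamSum(values)));
    }
}
```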

For-loops over collections usually involve creating an iterator and thus have somewhat more overhead than array-based loops. In particular, each step of an iterator-based loop involves two method calls, hasNext and next, whereas a stream can get each element with a single method call. For this reason there are times a sequential stream can beat an iterator-based loop (given the right per-element workload, a sufficiently large number of elements, etc.). So even though there is some setup cost for a stream, there is also the possibility that it might end up running faster than a conventional for-loop.
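
The difference in call pattern can be sketched as follows. Streams are built on java.util.Spliterator, whose tryAdvance method combines the "is there another element" check and the element hand-off into one call. Using the spliterator directly, as below, is only meant to expose that pattern; it is an assumption of this sketch that it mirrors what a stream does internally for a given collection, and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Spliterator;

class TraversalSketch {
    // Iterator-based loop: two method calls per element.
    static long sumWithIterator(List<Integer> values) {
        long total = 0;
        Iterator<Integer> it = values.iterator();
        while (it.hasNext()) {      // call 1
            total += it.next();     // call 2
        }
        return total;
    }

    // Spliterator-based traversal: one tryAdvance call per element.
    static long sumWithSpliterator(List<Integer> values) {
        long[] total = {0};         // array so the lambda can update it
        Spliterator<Integer> sp = values.spliterator();
        while (sp.tryAdvance(v -> total[0] += v)) {
            // body intentionally empty; the work happens in the lambda
        }
        return total[0];
    }

    // The same computation written as an ordinary sequential stream.
    static long sumWithStream(List<Integer> values) {
        return values.stream().mapToLong(Integer::longValue).sum();
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(1, 2, 3, 4, 5);
        System.out.println(sumWithIterator(values));
        System.out.println(sumWithSpliterator(values));
        System.out.println(sumWithStream(values));
    }
}
```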

Finally, performance isn't the only consideration. There is also readability and maintainability. The streams and lambda stuff may initially be new and unfamiliar, but it has great potential to simplify and clean up code. See my answer to another question, for example.
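
As a small, self-contained illustration of the readability argument (not the linked answer itself), the sketch below writes the same filter-and-transform step both ways. The class ReadabilitySketch and the "starts with a" rule are invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class ReadabilitySketch {
    // Conventional loop: the result list and the control flow are managed by hand.
    static List<String> withLoop(List<String> names) {
        List<String> result = new ArrayList<>();
        for (String name : names) {
            if (name.startsWith("a")) {
                result.add(name.toUpperCase());
            }
        }
        return result;
    }

    // Stream version: each step of the pipeline names what it does.
    static List<String> withStream(List<String> names) {
        return names.stream()
                .filter(name -> name.startsWith("a"))
                .map(String::toUpperCase)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("anna", "bob", "alex");
        System.out.println(withLoop(names).equals(withStream(names)));
    }
}
```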



Source: https://stackoverflow.com/questions/25625250/java-8-first-use-of-stream-or-parallelstream-very-slow-usage-in-practice
