Best way performance wise to use limit on stream in case of multithreading

迷失自我 2021-01-15 18:24

I watched a talk by José Paumard on InfoQ: http://www.infoq.com/fr/presentations/jdk8-lambdas-streams-collectors (in French)

The thing is I got stuck on this one point

3 answers
  •  执笔经年
    2021-01-15 19:07

    Why is the second code better?

    In the first case you create an infinite source, split it for parallel execution into a bunch of tasks, each providing an infinite number of elements, and then limit the overall size of the result. Even though the source is unordered, this implies some overhead: the individual tasks have to talk to each other to check when the overall size has been reached. If they talk often, contention increases. If they talk less, they produce more numbers than necessary and then drop some of them. I believe the actual Stream API implementation talks less between tasks, which leads to producing more numbers than necessary. This also increases memory consumption and puts more work on the garbage collector.

    In contrast, in the second case you create a finite source of known size. When the task is split into subtasks, their sizes are also well-defined, and in total they produce exactly the requested number of random numbers without needing to talk to each other at all. That's why it's faster.
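
The code from the question is not reproduced in this thread, so the following is only a sketch of the two shapes described above. It assumes the first variant was an unordered infinite generator cut off with limit, and the second a stream created with a known size. The class and method names are made up for illustration.

```java
import java.util.Spliterator;
import java.util.concurrent.ThreadLocalRandom;
import java.util.stream.LongStream;

class LimitSketch {
    // Assumed shape of the "first case": infinite unordered source, then limit.
    static long[] infiniteThenLimit(int n) {
        return LongStream.generate(() -> ThreadLocalRandom.current().nextLong())
                .parallel()
                .limit(n)
                .toArray();
    }

    // Assumed shape of the "second case": the source knows its size up front.
    static long[] sizedSource(int n) {
        return ThreadLocalRandom.current().longs(n).parallel().toArray();
    }

    // Reports whether the stream's source spliterator advertises an exact size.
    static boolean isSized(LongStream stream) {
        return stream.spliterator().hasCharacteristics(Spliterator.SIZED);
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        // Both variants return n elements; they differ in how the work is divided.
        System.out.println(infiniteThenLimit(n).length);
        System.out.println(sizedSource(n).length);
        // Compare what each kind of source reports about its size.
        System.out.println(isSized(LongStream.generate(() -> 1L)));
        System.out.println(isSized(ThreadLocalRandom.current().longs(n)));
    }
}
```

A source that reports SIZED can hand each subtask an exact share of the elements, which is the property the second case relies on.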

    Is there a better, or at least less costly way to do it?

    The biggest problem in your code samples is boxing. If you need 10_000_000 random numbers, it's a very bad idea to box each of them and store them in a List: you create tons of unnecessary objects, perform many heap allocations and so on. Replace this with primitive streams:

    long[] randomNumbers = ThreadLocalRandom.current().longs(10_000_000).parallel().toArray();
    

    This would be much faster (probably by an order of magnitude).
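
To make the difference concrete, here is a rough sketch that produces the same amount of data both ways. The boxed variant is only a guess at what the question's code looked like, the names are hypothetical, and the nanoTime measurement is a crude indication rather than a proper benchmark (a harness such as JMH would be needed for reliable numbers).

```java
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;
import java.util.stream.Collectors;

class BoxingSketch {
    // Boxed variant: every long becomes a Long object stored in a List.
    static List<Long> boxed(int n) {
        return ThreadLocalRandom.current().longs(n).parallel()
                .boxed()
                .collect(Collectors.toList());
    }

    // Primitive variant: values go straight into a long[].
    static long[] primitive(int n) {
        return ThreadLocalRandom.current().longs(n).parallel().toArray();
    }

    public static void main(String[] args) {
        int n = 10_000_000;
        long t0 = System.nanoTime();
        List<Long> list = boxed(n);
        long t1 = System.nanoTime();
        long[] array = primitive(n);
        long t2 = System.nanoTime();
        // Single-shot timings, no warm-up: treat them as a hint only.
        System.out.printf("boxed:     %d elements, %d ms%n", list.size(), (t1 - t0) / 1_000_000);
        System.out.printf("primitive: %d elements, %d ms%n", array.length, (t2 - t1) / 1_000_000);
    }
}
```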

    You may also consider the new Java 8 SplittableRandom class. It provides roughly the same performance, but the generated random numbers are of much higher quality (including passing DieHarder 3.31.1):

    long[] randomNumbers = new SplittableRandom().longs(10_000_000).parallel().toArray();
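
Unlike ThreadLocalRandom, SplittableRandom can be given an explicit seed, which helps when a run needs to be repeated. A small sketch, where the seed 42, the bounds and the class name are arbitrary choices for illustration. It uses a sequential stream on purpose: I would not rely on a parallel stream reproducing the same sequence, since the generator is split between subtasks.

```java
import java.util.Arrays;
import java.util.SplittableRandom;

class SeededSketch {
    // Sequential stream from a seeded generator, with values in [origin, bound).
    static long[] seeded(long seed, int n, long origin, long bound) {
        return new SplittableRandom(seed).longs(n, origin, bound).toArray();
    }

    public static void main(String[] args) {
        long[] first = seeded(42L, 5, 0L, 100L);
        long[] second = seeded(42L, 5, 0L, 100L);
        // Same seed and same sequential pipeline, so the two arrays match.
        System.out.println(Arrays.equals(first, second));
        System.out.println(Arrays.toString(first));
    }
}
```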
    
