Best way, performance-wise, to use limit() on a stream with multithreading

迷失自我 2021-01-15 18:24

I watched a talk by José Paumard on InfoQ: http://www.infoq.com/fr/presentations/jdk8-lambdas-streams-collectors (in French)

The thing is, I got stuck on this one point

3 answers
  •  予麋鹿 (original poster)
    2021-01-15 18:56

    The JDK docs have a good explanation of this behavior: it is the ordering constraint that kills performance in parallel processing.

    Text from the docs for the limit() function (API note in LongStream): https://docs.oracle.com/javase/8/docs/api/java/util/stream/LongStream.html

    While limit() is generally a cheap operation on sequential stream pipelines, it can be quite expensive on ordered parallel pipelines, especially for large values of maxSize, since limit(n) is constrained to return not just any n elements, but the first n elements in the encounter order. Using an unordered stream source (such as generate(LongSupplier)) or removing the ordering constraint with BaseStream.unordered() may result in significant speedups of limit() in parallel pipelines, if the semantics of your situation permit. If consistency with encounter order is required, and you are experiencing poor performance or memory utilization with limit() in parallel pipelines, switching to sequential execution with sequential() may improve performance.
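
The three options the API note describes can be sketched roughly as follows. This is only an illustration, not code from the talk or the question: the class name LimitDemo, the method names, the even-number filter, and the bounds are all made up for the example. It shows the semantics only (which elements come back); whether unordered() or sequential() actually helps depends on the workload and should be measured.

```java
import java.util.Arrays;
import java.util.stream.LongStream;

// Hypothetical demo class; names and values are assumptions for illustration.
class LimitDemo {

    // Ordered parallel pipeline: limit(n) must return the FIRST n matching
    // elements in encounter order, which forces coordination between threads.
    public static long[] orderedLimit(long bound, long n) {
        return LongStream.range(0, bound)
                .parallel()
                .filter(x -> x % 2 == 0)
                .limit(n)
                .toArray();
    }

    // Ordering constraint removed with unordered(): limit(n) may return
    // ANY n matching elements, so the result can differ between runs.
    public static long[] unorderedLimit(long bound, long n) {
        return LongStream.range(0, bound)
                .parallel()
                .unordered()
                .filter(x -> x % 2 == 0)
                .limit(n)
                .toArray();
    }

    // Sequential pipeline: encounter order is kept and limit() is cheap.
    public static long[] sequentialLimit(long bound, long n) {
        return LongStream.range(0, bound)
                .filter(x -> x % 2 == 0)
                .limit(n)
                .toArray();
    }

    public static void main(String[] args) {
        long bound = 1_000_000L;
        long n = 5;
        System.out.println("ordered parallel:   " + Arrays.toString(orderedLimit(bound, n)));
        System.out.println("unordered parallel: " + Arrays.toString(unorderedLimit(bound, n)));
        System.out.println("sequential:         " + Arrays.toString(sequentialLimit(bound, n)));
    }
}
```

The ordered and sequential variants always yield the first five even numbers; the unordered variant only guarantees five even numbers from the range, so use it only when any n elements will do.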
