Stream.skip behavior with unordered terminal operation

醉梦人生 2020-11-30 04:15

I've already read this and this question, but I still doubt whether the observed behavior of Stream.skip was intended by the JDK authors.

Let's have simple

2 answers
  •  野趣味 (original poster)
    2020-11-30 04:28

    "@Ruben, you probably don't understand my question. Roughly, the problem is: why does unordered().collect(toCollection(HashSet::new)) behave differently from collect(toSet())? Of course I know that toSet() is unordered."

    Probably, but, anyway, I will give it a second try.

    Looking at the Javadocs of Collectors.toSet and Collectors.toCollection, we can see that toSet delivers an unordered collector:

    This is an {@link Collector.Characteristics#UNORDERED unordered} Collector.

    i.e., a CollectorImpl with the UNORDERED characteristic. Looking at the Javadoc of Collector.Characteristics#UNORDERED, we can read:

    Indicates that the collection operation does not commit to preserving the encounter order of input elements
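
The difference between the two collectors can be checked directly by printing their characteristics. This is only a minimal sketch; the class and method names (CollectorCharacteristicsDemo, toSetCharacteristics, toCollectionCharacteristics) are made up for illustration and are not part of the original question.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.stream.Collector;
import java.util.stream.Collectors;

// Hypothetical demo class, not taken from the original question.
public class CollectorCharacteristicsDemo {

    // Characteristics reported by the collector returned from Collectors.toSet().
    static Set<Collector.Characteristics> toSetCharacteristics() {
        Collector<Object, ?, Set<Object>> collector = Collectors.toSet();
        return collector.characteristics();
    }

    // Characteristics reported by Collectors.toCollection(HashSet::new).
    static Set<Collector.Characteristics> toCollectionCharacteristics() {
        Collector<Object, ?, HashSet<Object>> collector =
                Collectors.toCollection(HashSet::new);
        return collector.characteristics();
    }

    public static void main(String[] args) {
        System.out.println("toSet: " + toSetCharacteristics());
        System.out.println("toCollection(HashSet::new): " + toCollectionCharacteristics());
    }
}
```

Only the toSet() collector should list UNORDERED. The toCollection collector cannot know that the supplied HashSet ignores insertion order, so it does not declare it.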

    In the Javadocs of Collector we can also see:

    For concurrent collectors, an implementation is free to (but not required to) implement reduction concurrently. A concurrent reduction is one where the accumulator function is called concurrently from multiple threads, using the same concurrently-modifiable result container, rather than keeping the result isolated during accumulation. A concurrent reduction should only be applied if the collector has the {@link Characteristics#UNORDERED} characteristics or if the originating data is unordered

    To me this means that, once the UNORDERED characteristic is set, we do not care at all about the order in which the elements of the stream are passed to the accumulator, and therefore the elements can be pulled from the pipeline in any order.

    Btw, you get the same behavior if you omit the unordered() in your example:

        System.out.println("skip-toSet: "
                + input.parallelStream().filter(x -> x > 0)
                    .skip(1)
                    .collect(Collectors.toSet()));
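
To try the comparison yourself, here is a self-contained sketch. The input (the integers 1 to 20) and the names SkipToSetDemo, sampleInput, skipToSet and missing are assumptions made for illustration; the original question's input is not shown here. Which element gets dropped may vary between runs and JDK versions, so the code reports it rather than assuming it.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

// Hypothetical demo class; the input 1..20 is an assumption.
public class SkipToSetDemo {

    static List<Integer> sampleInput() {
        return IntStream.rangeClosed(1, 20).boxed().collect(Collectors.toList());
    }

    // Runs the pipeline with or without an explicit unordered() call.
    static Set<Integer> skipToSet(List<Integer> input, boolean explicitUnordered) {
        Stream<Integer> stream = input.parallelStream().filter(x -> x > 0);
        if (explicitUnordered) {
            stream = stream.unordered();
        }
        return stream.skip(1).collect(Collectors.toSet());
    }

    // Elements of the input that did not make it into the result.
    static Set<Integer> missing(List<Integer> input, Set<Integer> result) {
        Set<Integer> dropped = new TreeSet<>(input);
        dropped.removeAll(result);
        return dropped;
    }

    public static void main(String[] args) {
        List<Integer> input = sampleInput();
        System.out.println("unordered-skip-toSet dropped: "
                + missing(input, skipToSet(input, true)));
        System.out.println("skip-toSet dropped: "
                + missing(input, skipToSet(input, false)));
    }
}
```

In both variants exactly one element is dropped. The interesting part is whether it is always the first one.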
    

    Furthermore, the skip() method in Stream gives us a hint:

    While {@code skip()} is generally a cheap operation on sequential stream pipelines, it can be quite expensive on ordered parallel pipelines

    and

    Using an unordered stream source (such as {@link #generate(Supplier)}) or removing the ordering constraint with {@link #unordered()} may result in significant speedups

    When using

    Collectors.toCollection(HashSet::new)
    

    you are creating a normal "ordered" Collector (one without the UNORDERED characteristic), which to me means that you do care about the ordering; therefore, the elements are extracted in order and you get the expected behavior.
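
The ordered case can be sketched the same way. Again, the input (the integers 1 to 20) and the names SkipToCollectionDemo, sampleInput and skipToCollection are hypothetical and only serve the illustration. Since the list source is ordered and nothing in the pipeline drops the ordering constraint, skip(1) should discard the first element in encounter order.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical demo class; the input 1..20 is an assumption.
public class SkipToCollectionDemo {

    static List<Integer> sampleInput() {
        return IntStream.rangeClosed(1, 20).boxed().collect(Collectors.toList());
    }

    // Ordered parallel pipeline: no unordered() call and no UNORDERED collector.
    static Set<Integer> skipToCollection(List<Integer> input) {
        return input.parallelStream()
                .filter(x -> x > 0)
                .skip(1)
                .collect(Collectors.toCollection(HashSet::new));
    }

    public static void main(String[] args) {
        Set<Integer> result = skipToCollection(sampleInput());
        System.out.println("skip-toCollection: " + result);
        System.out.println("contains first element (1)? " + result.contains(1));
    }
}
```

The result is still a HashSet, so its iteration order carries no meaning; only the choice of the skipped element is affected by the ordering of the pipeline.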
