Is .collect guaranteed to be ordered on parallel streams?

深忆病人 2020-12-07 16:41

Given I have a list of Strings, List<String> toProcess. The results have to be in the order the original lines were given. I want to utilize the new parall

2 Answers
  •  萌比男神i
    2020-12-07 17:07

    TL;DR

    Yes, the order is guaranteed.

    Stream.collect() API documentation

    The place to start is with what determines whether a reduction is concurrent. Stream.collect()'s description says the following:

    If the stream is parallel, and the Collector is concurrent, and either the stream is unordered or the collector is unordered, then a concurrent reduction will be performed (see Collector for details on concurrent reduction.)

    The first condition is satisfied: the stream is parallel. How about the second and third: is the Collector concurrent and unordered?
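
The documented rule can be restated as a small predicate. The sketch below is only an illustration: the class name ConcurrentReductionRule and the helper isConcurrentReduction are hypothetical, not part of the JDK. It reads the collector's traits through the public Collector.characteristics() method and applies the three conditions quoted above.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class ConcurrentReductionRule {
    // Hypothetical helper: restates the rule from the Stream.collect() documentation.
    static boolean isConcurrentReduction(boolean streamParallel,
                                         boolean streamOrdered,
                                         Collector<?, ?, ?> collector) {
        Set<Collector.Characteristics> traits = collector.characteristics();
        boolean collectorConcurrent = traits.contains(Collector.Characteristics.CONCURRENT);
        boolean collectorUnordered = traits.contains(Collector.Characteristics.UNORDERED);
        // Parallel stream AND concurrent collector AND (unordered stream OR unordered collector)
        return streamParallel && collectorConcurrent && (!streamOrdered || collectorUnordered);
    }

    public static void main(String[] args) {
        Collector<String, ?, List<String>> toList = Collectors.toList();
        Collector<String, ?, ConcurrentMap<String, Integer>> toConcurrentMap =
                Collectors.toConcurrentMap(Function.identity(), String::length);

        // Parallel, ordered stream in both cases.
        System.out.println(isConcurrentReduction(true, true, toList));          // false
        System.out.println(isConcurrentReduction(true, true, toConcurrentMap)); // true
    }
}
```

toConcurrentMap() is used here only as a contrast, since it declares both the CONCURRENT and UNORDERED traits.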
     

    Collectors.toList() API documentation

    toList()'s documentation reads:

    Returns a Collector that accumulates the input elements into a new List. There are no guarantees on the type, mutability, serializability, or thread-safety of the List returned; if more control over the returned List is required, use toCollection(Supplier).

    Returns:
    a Collector which collects all the input elements into a List, in encounter order

    An operation that works in encounter order operates on the elements in their original order, even when the stream is processed in parallel.
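
A minimal sketch of the scenario from the question, assuming a stand-in process() step (hypothetical; the question does not say what the real processing is). The source is a List, so the stream has an encounter order, and collecting with toList() returns the results in that order even though the mapping runs in parallel.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class EncounterOrderDemo {
    // Hypothetical per-line transformation; stands in for the real processing.
    static String process(String line) {
        return line.toUpperCase();
    }

    static List<String> processAll(List<String> toProcess) {
        return toProcess.parallelStream()
                .map(EncounterOrderDemo::process) // may run on several threads
                .collect(Collectors.toList());    // result follows encounter order
    }

    public static void main(String[] args) {
        List<String> toProcess = Arrays.asList("a", "b", "c", "d", "e");
        System.out.println(processAll(toProcess)); // prints [A, B, C, D, E]
    }
}
```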
     

    Implementation code

    Inspecting the implementation of Collectors.java confirms that toList() does not include the CONCURRENT or UNORDERED traits.

    public static <T>
    Collector<T, ?, List<T>> toList() {
        return new CollectorImpl<>((Supplier<List<T>>) ArrayList::new, List::add,
                                   (left, right) -> { left.addAll(right); return left; },
                                   CH_ID);
    }
    
    // ...
    
    static final Set<Collector.Characteristics> CH_ID
            = Collections.unmodifiableSet(EnumSet.of(Collector.Characteristics.IDENTITY_FINISH));
    

    Notice how the collector has the CH_ID trait set, which has only the single IDENTITY_FINISH trait. CONCURRENT and UNORDERED are not there, so the reduction cannot be concurrent.

    A non-concurrent reduction means that, if the stream is parallel, collection can proceed in parallel, but it will be split into several thread-confined intermediate results which are then combined. This ensures the combined result is in encounter order.
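
The same supplier/accumulator/combiner shape can be tried directly through the three-argument Stream.collect() overload. This is a sketch for illustration rather than what toList() does internally: the class and method names are hypothetical. Each worker fills its own ArrayList, and the partial lists are merged with addAll, so the final list should match the encounter order of the source.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ThreadConfinedCollect {
    // Hypothetical helper: collects 0..n-1 from a parallel, ordered stream.
    static List<Integer> collectInParallel(int n) {
        return IntStream.range(0, n)
                .parallel()
                .boxed()
                .collect(
                        ArrayList::new,     // supplier: a fresh, thread-confined container
                        ArrayList::add,     // accumulator: adds one element to a container
                        ArrayList::addAll); // combiner: appends the right partial result to the left
    }

    public static void main(String[] args) {
        List<Integer> expected = IntStream.range(0, 1000).boxed().collect(Collectors.toList());
        System.out.println(collectInParallel(1000).equals(expected)); // prints true
    }
}
```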
     

    See also: Why parallel stream get collected sequentially in Java 8
