Java 8 stream processing not fluent [closed]

Question

I have a problem with Java 8 streams: the data is processed in sudden bulks rather than when it is requested. I have a rather complex stream flow, which has to be parallelised because I use concat to merge two streams.

My issue stems from the fact that data seems to be parsed in large bulks minutes, and sometimes even hours, apart. I would expect this processing to happen as soon as the Stream reads incoming data, to spread the workload. Bulk processing seems counterintuitive in almost every way.

So, the question is why this bulk-collection occurs and how I can avoid it.

My input is a Spliterator of unknown size and I use a forEach as the terminal operation.

Answer 1:

It’s a fundamental principle of parallel streams that the encounter order doesn’t have to match the processing order. This enables concurrent processing of items of sublists or subtrees while assembling a correctly ordered result, if necessary. This explicitly allows bulk processing and even makes it mandatory for the parallel processing of ordered streams.

This behavior is determined by the Spliterator’s particular trySplit implementation. The specification says:

If this Spliterator is ORDERED, the returned Spliterator must cover a strict prefix of the elements

…

API Note:

An ideal trySplit method efficiently (without traversal) divides its elements exactly in half, allowing balanced parallel computation.
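
To make the prefix rule concrete, here is a minimal sketch using the spliterator of an ArrayList, which is ORDERED and splits in half. The class name PrefixSplitDemo and the drain helper are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Spliterator;

public class PrefixSplitDemo {
    /** Drains a spliterator into a list so we can see which elements it covers. */
    static <T> List<T> drain(Spliterator<T> s) {
        List<T> out = new ArrayList<>();
        s.forEachRemaining(out::add);
        return out;
    }

    public static void main(String[] args) {
        List<Integer> source = new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8));
        Spliterator<Integer> rest = source.spliterator();
        // trySplit hands back the leading part; "rest" keeps the remainder
        Spliterator<Integer> prefix = rest.trySplit();

        System.out.println("prefix = " + drain(prefix)); // prefix = [1, 2, 3, 4]
        System.out.println("rest   = " + drain(rest));   // rest   = [5, 6, 7, 8]
    }
}
```

The returned spliterator covers the first half and the original one continues with the second half, so each worker receives a contiguous chunk rather than single elements.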

Why was this strategy fixed in the specification and not, e.g. an even/odd split?

Well, consider a simple use case. A list will be filtered and collected into a new list, thus the encounter order must be retained. With the prefix rule, it’s rather easy to implement: split off a prefix, filter both chunks concurrently, then add the result of the prefix filtering to the new list, followed by the filtered suffix.
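
The steps just described can be sketched by hand as follows. This is only an illustration of the idea with a single split, not how the Stream implementation is actually written; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Spliterator;
import java.util.concurrent.CompletableFuture;
import java.util.function.Predicate;

public class PrefixFilterJoin {
    /** Filters everything the given spliterator covers into a fresh list. */
    static <T> List<T> filterChunk(Spliterator<T> chunk, Predicate<T> keep) {
        List<T> out = new ArrayList<>();
        chunk.forEachRemaining(t -> { if (keep.test(t)) out.add(t); });
        return out;
    }

    static <T> List<T> filterInTwoChunks(List<T> source, Predicate<T> keep) {
        Spliterator<T> suffix = source.spliterator();
        Spliterator<T> prefix = suffix.trySplit(); // null if the source cannot be split
        if (prefix == null) {
            return filterChunk(suffix, keep);
        }
        // filter the prefix on another thread while this thread handles the suffix
        CompletableFuture<List<T>> prefixResult =
                CompletableFuture.supplyAsync(() -> filterChunk(prefix, keep));
        List<T> suffixResult = filterChunk(suffix, keep);

        // joining is just two bulk appends: prefix result first, then suffix result
        List<T> result = new ArrayList<>(prefixResult.join());
        result.addAll(suffixResult);
        return result;
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
        System.out.println(filterInTwoChunks(numbers, n -> n % 2 == 0)); // [2, 4, 6, 8, 10]
    }
}
```

Neither chunk needs to know how many elements the other one kept; the encounter order falls out of appending the prefix result before the suffix result.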

With an even/odd strategy, that’s impossible. You may filter both parts concurrently, but afterwards, you don’t know how to join the results correctly unless you track each item’s position throughout the entire operation.

Even then, joining these geared items would be much more complicated than performing an addAll per chunk.
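
A small sketch of why the even/odd results cannot simply be put back together. The helper names are hypothetical and the split is simulated with index arithmetic on a list rather than with a real Spliterator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class EvenOddJoinProblem {
    /** Filters the elements at positions start, start + 2, start + 4, ... */
    static List<Integer> filterEveryOther(List<Integer> source, int start, Predicate<Integer> keep) {
        List<Integer> out = new ArrayList<>();
        for (int i = start; i < source.size(); i += 2) {
            if (keep.test(source.get(i))) out.add(source.get(i));
        }
        return out;
    }

    /** Naive join: alternately takes one element from each list. */
    static List<Integer> interleave(List<Integer> a, List<Integer> b) {
        List<Integer> out = new ArrayList<>();
        for (int i = 0; i < Math.max(a.size(), b.size()); i++) {
            if (i < a.size()) out.add(a.get(i));
            if (i < b.size()) out.add(b.get(i));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> source = Arrays.asList(10, 11, 12, 13, 14, 15, 16, 17);
        Predicate<Integer> keep = n -> n % 3 != 0; // drops 12 and 15

        List<Integer> evens = filterEveryOther(source, 0, keep); // [10, 14, 16]
        List<Integer> odds = filterEveryOther(source, 1, keep);  // [11, 13, 17]

        // The correct encounter order would be [10, 11, 13, 14, 16, 17]
        System.out.println(interleave(evens, odds)); // [10, 11, 14, 13, 16, 17]
    }
}
```

Once the filter has removed an element from one side, the alternation no longer lines up, so a correct join would need the original index of every surviving element.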

You might have noticed that all of this applies only if you have an encounter order that might have to be retained. If your spliterator doesn’t report an ORDERED characteristic, it is not required to return a prefix. Nevertheless, the default implementation you might have inherited from AbstractSpliterator is designed to be compatible with ordered spliterators. Thus, if you want a different strategy, you have to implement the split operation yourself.
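
As a rough sketch of what implementing the split operation yourself could look like: the class below is hypothetical (SmallBatchSpliterator is not a JDK class). It wraps an Iterator, reports no ORDERED characteristic, and splits off small fixed-size batches. It assumes the iterator is accessed only through this spliterator and that the size of the source is unknown.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.Consumer;
import java.util.stream.StreamSupport;

/** Hypothetical helper: an unordered spliterator that splits off small fixed-size batches. */
public class SmallBatchSpliterator<T> implements Spliterator<T> {
    private final Iterator<T> source;
    private final int batchSize;

    public SmallBatchSpliterator(Iterator<T> source, int batchSize) {
        if (batchSize < 1) throw new IllegalArgumentException("batchSize must be >= 1");
        this.source = source;
        this.batchSize = batchSize;
    }

    @Override
    public boolean tryAdvance(Consumer<? super T> action) {
        if (!source.hasNext()) return false;
        action.accept(source.next());
        return true;
    }

    @Override
    public Spliterator<T> trySplit() {
        if (!source.hasNext()) return null;
        // pull at most batchSize elements and hand them off as an array-backed spliterator
        Object[] batch = new Object[batchSize];
        int n = 0;
        while (n < batchSize && source.hasNext()) batch[n++] = source.next();
        return Spliterators.spliterator(batch, 0, n, characteristics());
    }

    @Override
    public long estimateSize() {
        return Long.MAX_VALUE; // size unknown
    }

    @Override
    public int characteristics() {
        return 0; // deliberately neither ORDERED nor SIZED
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
        // batch size 1: every element may be handed to a worker individually
        StreamSupport.stream(new SmallBatchSpliterator<>(input.iterator(), 1), true)
                .forEach(n -> System.out.println(Thread.currentThread().getName() + " -> " + n));
    }
}
```

The batch size is a trade-off: a small batch lets processing start soon after an element arrives, at the cost of more task overhead per element. Whether that is acceptable depends on how expensive the per-element work is compared to reading from the source.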

Or you use a different way of implementing an unordered stream, e.g.

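// needs java.util.stream.Stream, java.util.concurrent.TimeUnit, java.util.concurrent.locks.LockSupport
Stream.generate(() -> {
    // simulate a slow source: each element takes about one second to produce
    LockSupport.parkNanos(TimeUnit.SECONDS.toNanos(1));
    return Thread.currentThread().getName();
}).parallel().forEach(System.out::println); // infinite stream, runs until the JVM is stopped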

might be closer to what you expected.

Source: https://stackoverflow.com/questions/32526921/java-8-stream-processing-not-fluent

Tags

java

java-8

java-stream