I was watching a presentation on Java, and at one point, the lecturer said:
"Mutability is OK, sharing is nice, shared mutability is devil's work."
The thing is that the lecturer is slightly wrong at the same time. The example that he provided uses forEach, which is documented as:
The behavior of this operation is explicitly nondeterministic. For parallel stream pipelines, this operation does not guarantee to respect the encounter order of the stream, as doing so would sacrifice the benefit of parallelism...
You could use:
    numbers.stream()
           .filter(e -> e % 2 == 0)
           .map(e -> e * 2)
           .parallel()
           .forEachOrdered(e -> doubleOfEven.add(e));
And you would always have the same guaranteed result.
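For completeness, here is a minimal runnable sketch of that pipeline. The class name, the method name and the contents of numbers are my own assumptions, since the lecture's input is not shown here. forEachOrdered processes one element at a time in encounter order, which is why adding to a plain ArrayList works in this case.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ForEachOrderedDemo {

    // Doubles the even numbers and adds them to a shared list.
    // forEachOrdered handles elements one at a time, in encounter order,
    // even though the stream itself is parallel.
    static List<Integer> doubleOfEven(List<Integer> numbers) {
        List<Integer> doubleOfEven = new ArrayList<>();
        numbers.stream()
               .filter(e -> e % 2 == 0)
               .map(e -> e * 2)
               .parallel()
               .forEachOrdered(e -> doubleOfEven.add(e));
        return doubleOfEven;
    }

    public static void main(String[] args) {
        // Hypothetical input, chosen only for illustration
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
        System.out.println(doubleOfEven(numbers)); // prints [4, 8, 12, 16, 20]
    }
}
```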
On the other hand, the example that uses Collectors.toList is better, because Collectors.toList respects encounter order, so it works just fine.
An interesting point is that Collectors.toList uses ArrayList underneath, which is not a thread-safe collection. It's just that it uses many of them (one per chunk of work in parallel processing) and merges them at the end.
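The same idea can be written out by hand with the three-argument collect. This is only a sketch of the principle, not the actual source of Collectors.toList, and the class and method names are made up for illustration: a supplier creates a fresh ArrayList for each chunk, an accumulator adds to that thread-confined list, and a combiner merges the partial lists in encounter order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ManualToListDemo {

    static List<Integer> doubleOfEven(List<Integer> numbers) {
        return numbers.parallelStream()
                .filter(e -> e % 2 == 0)
                .map(e -> e * 2)
                .collect(
                        ArrayList::new,     // supplier: a new list for each chunk
                        ArrayList::add,     // accumulator: add to that chunk's own list
                        ArrayList::addAll); // combiner: merge partial lists, left to right
    }

    public static void main(String[] args) {
        // Hypothetical input, chosen only for illustration
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
        System.out.println(doubleOfEven(numbers)); // prints [4, 8, 12, 16, 20]
    }
}
```

No ArrayList is ever touched by two threads at once, so no synchronization is needed.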
A last note: parallel and sequential do not influence the encounter order; it's the operations applied to the Stream that do. Excellent read here.
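One way to see this is to compare findFirst and findAny on the same parallel stream. The sketch below uses made-up class and method names and a hypothetical input list. findFirst honors encounter order even in parallel, while findAny is explicitly allowed to return any matching element.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

class EncounterOrderDemo {

    // Always the first even number in encounter order, parallel or not.
    static Optional<Integer> firstEven(List<Integer> numbers) {
        return numbers.parallelStream()
                .filter(e -> e % 2 == 0)
                .findFirst();
    }

    // Some even number; which one may differ from run to run.
    static Optional<Integer> anyEven(List<Integer> numbers) {
        return numbers.parallelStream()
                .filter(e -> e % 2 == 0)
                .findAny();
    }

    public static void main(String[] args) {
        // Hypothetical input, chosen only for illustration
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
        System.out.println(firstEven(numbers)); // prints Optional[2]
        System.out.println(anyEven(numbers));   // an even number, not necessarily 2
    }
}
```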
We also need to keep in mind that even a thread-safe collection does not make a Stream pipeline completely safe, especially when you are relying on side effects.
    List<Integer> numbers = Arrays.asList(1, 3, 3, 5);
    // Thread-safe, but still shared mutable state touched from inside map()
    Set<Integer> seen = Collections.synchronizedSet(new HashSet<>());
    List<Integer> collected = numbers.stream()
            .parallel()
            .map(e -> {
                if (seen.add(e)) {
                    return 0;
                } else {
                    return e;
                }
            })
            .collect(Collectors.toList());
    System.out.println(collected);
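collected at this point could be [0, 3, 0, 0] or [0, 0, 3, 0]. Whichever 3 reaches seen.add first is mapped to 0 and the other one is returned as 3, so the result depends on thread scheduling, not on encounter order.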
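If you want to observe this yourself, a rough sketch like the one below repeats the pipeline many times and gathers the distinct results. The class name, method names and number of runs are arbitrary choices of mine. Whether you actually see both outcomes depends on your machine and on how the work is split across threads.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

class SideEffectDemo {

    // One run of the pipeline, with a fresh synchronized set each time.
    static List<Integer> runOnce() {
        List<Integer> numbers = Arrays.asList(1, 3, 3, 5);
        Set<Integer> seen = Collections.synchronizedSet(new HashSet<>());
        return numbers.stream()
                .parallel()
                .map(e -> seen.add(e) ? 0 : e) // side effect inside map()
                .collect(Collectors.toList());
    }

    // Repeats the pipeline and collects every distinct result observed.
    static Set<List<Integer>> distinctOutcomes(int runs) {
        Set<List<Integer>> outcomes = new HashSet<>();
        for (int i = 0; i < runs; i++) {
            outcomes.add(runOnce());
        }
        return outcomes;
    }

    public static void main(String[] args) {
        // May print one or both of [0, 0, 3, 0] and [0, 3, 0, 0]
        System.out.println(distinctOutcomes(10_000));
    }
}
```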