I am continuing with my Java 8 learning and have found some interesting behavior.
Let's look at a code sample:
// identity value and accumulator and combiner
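A minimal sketch of the kind of call I mean, assuming the three-argument Stream.reduce(identity, accumulator, combiner) form that the comment above refers to (the concrete values are only illustrative):

import java.util.Arrays;
import java.util.List;

public class ReduceDemo {
    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3);

        // Three-argument reduce: identity, accumulator, combiner.
        // 0 is a true identity for addition, so this behaves the same
        // sequentially and in parallel.
        int sum = numbers.parallelStream()
                         .reduce(0,             // identity value
                                 Integer::sum,  // accumulator
                                 Integer::sum); // combiner

        System.out.println(sum); // prints 6
    }
}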
I have a slightly different perspective here. Although @user43968's answer gives a plausible justification for why the identity is needed for parallelism, is it really necessary? I believe not, because the associativity of the binary operator is by itself enough to let us parallelize the reduce job.
Given an expression A op B op C op D, associativity guarantees that its evaluation is equivalent to (A op B) op (C op D), so we can evaluate the sub-expressions (A op B) and (C op D) in parallel and combine the results afterward without changing the final result. For example, with the addition operation, an initial value of 10, and L = [1, 2, 3], we want to compute 10 + 1 + 2 + 3 = 16. We can safely compute 10 + 1 = 11 and 2 + 3 = 5 in parallel and then compute 11 + 5 = 16 at the end.
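Here is a minimal sketch of that decomposition (the class name and the use of CompletableFuture are just for illustration, not how Stream.reduce is actually implemented):

import java.util.concurrent.CompletableFuture;

public class AssociativityDemo {
    public static void main(String[] args) {
        // A op B op C op D with op = +, A = 10 (the initial value),
        // and the elements 1, 2, 3.
        CompletableFuture<Integer> left  = CompletableFuture.supplyAsync(() -> 10 + 1); // (A op B)
        CompletableFuture<Integer> right = CompletableFuture.supplyAsync(() -> 2 + 3);  // (C op D)

        // Combine the two partial results; associativity guarantees this
        // equals the left-to-right evaluation 10 + 1 + 2 + 3.
        int result = left.thenCombine(right, Integer::sum).join();
        System.out.println(result); // prints 16
    }
}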
The only reason I can think of for why Java requires the initial value to be an identity is that the language designers wanted to keep the implementation simple and all parallelized sub-jobs symmetric. Otherwise, they would have had to distinguish the first sub-job, which takes the initial value as input, from the other sub-jobs that don't. As it stands, they simply hand the same initial value to every sub-job, each of which is a "reduce" in its own right.
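You can observe this by passing a non-identity seed to the two-argument reduce on a parallel stream. The exact output depends on how the runtime happens to split the stream, so treat the parallel result below as one possible outcome rather than a guaranteed value:

import java.util.Arrays;
import java.util.List;

public class NonIdentitySeed {
    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3);

        // Sequentially, seeding with 10 happens to give the "expected" 16:
        // ((10 + 1) + 2) + 3.
        int sequential = numbers.stream().reduce(10, Integer::sum);
        System.out.println(sequential); // 16

        // In parallel, 10 is used as the starting value of every chunk,
        // so it can be added more than once. The contract leaves this
        // unspecified; with one chunk per element it comes out as
        // (10 + 1) + (10 + 2) + (10 + 3) = 36.
        int parallel = numbers.parallelStream().reduce(10, Integer::sum);
        System.out.println(parallel); // e.g. 36
    }
}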
However, this is more of an implementation limitation, which IMO should not be surfaced to users of the language. My gut feeling is that a simple implementation exists that does not require the initial value to be an identity; one possible sketch follows.
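For instance, one could reduce the elements alone with the seedless, Optional-returning reduce(BinaryOperator) and fold the initial value in exactly once at the end. This is only a sketch under that assumption, not how the JDK actually implements reduce, and the reduceWithSeed name is made up:

import java.util.Arrays;
import java.util.List;
import java.util.function.BinaryOperator;

public class SeededReduce {
    // Reduce the elements without any seed inside the parallel part,
    // then apply the initial value exactly once on the left.
    static <T> T reduceWithSeed(List<T> values, T initial, BinaryOperator<T> op) {
        return values.parallelStream()
                     .reduce(op)                                  // Optional<T>, no identity required
                     .map(partial -> op.apply(initial, partial))  // fold the seed in once
                     .orElse(initial);                            // empty input: just the seed
    }

    public static void main(String[] args) {
        int result = reduceWithSeed(Arrays.asList(1, 2, 3), 10, Integer::sum);
        System.out.println(result); // prints 16, even though 10 is not an identity for +
    }
}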