I just read about Branch-Prediction and wanted to try how this works with Java 8 Streams.
However the performance with Streams is always turning out to be wors
Everything is said, but I want to show you how your code should look like using JMH.
@Fork(3)
@BenchmarkMode(Mode.AverageTime)
@Measurement(iterations = 10, timeUnit = TimeUnit.NANOSECONDS)
@State(Scope.Benchmark)
@Threads(1)
@Warmup(iterations = 5, timeUnit = TimeUnit.NANOSECONDS)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public class MyBenchmark {
private final int totalSize = 32_768;
private final int filterValue = 1_280;
private final int loopCount = 10_000;
// private Random rnd;
private int[] array;
@Setup
public void setup() {
array = IntStream.range(0, totalSize).toArray();
// rnd = new Random(0);
// array = rnd.ints(totalSize).map(i -> i % 2560).toArray();
}
@Benchmark
public long conditionalOperatorTime() {
long sum = 0;
for (int j = 0; j < loopCount; j++) {
for (int c = 0; c < totalSize; ++c) {
sum += array[c] >= filterValue ? array[c] : 0;
}
}
return sum;
}
@Benchmark
public long branchStatementTime() {
long sum = 0;
for (int j = 0; j < loopCount; j++) {
for (int c = 0; c < totalSize; ++c) {
if (array[c] >= filterValue) {
sum += array[c];
}
}
}
return sum;
}
@Benchmark
public long streamsTime() {
long sum = 0;
for (int j = 0; j < loopCount; j++) {
sum += IntStream.of(array).filter(value -> value >= filterValue).sum();
}
return sum;
}
@Benchmark
public long parallelStreamsTime() {
long sum = 0;
for (int j = 0; j < loopCount; j++) {
sum += IntStream.of(array).parallel().filter(value -> value >= filterValue).sum();
}
return sum;
}
}
The results for a sorted array:
Benchmark Mode Cnt Score Error Units
MyBenchmark.branchStatementTime avgt 30 119833793,881 ± 1345228,723 ns/op
MyBenchmark.conditionalOperatorTime avgt 30 118146194,368 ± 1748693,962 ns/op
MyBenchmark.parallelStreamsTime avgt 30 499436897,422 ± 7344346,333 ns/op
MyBenchmark.streamsTime avgt 30 1126768177,407 ± 198712604,716 ns/op
Results for unsorted data:
Benchmark Mode Cnt Score Error Units
MyBenchmark.branchStatementTime avgt 30 534932594,083 ± 3622551,550 ns/op
MyBenchmark.conditionalOperatorTime avgt 30 530641033,317 ± 8849037,036 ns/op
MyBenchmark.parallelStreamsTime avgt 30 489184423,406 ± 5716369,132 ns/op
MyBenchmark.streamsTime avgt 30 1232020250,900 ± 185772971,366 ns/op
I only can say that there are many possibilities of JVM optimizations and maybe branch-prediction is also involved. Now it is up to you to interpret the benchmark results.