When should streams be preferred over traditional loops for best performance? Do streams take advantage of branch-prediction?

旧时难觅i 2020-12-13 17:05

I just read about branch prediction and wanted to try out how it works with Java 8 Streams.

However, the performance with Streams always turns out to be worse

5 answers
  •  春和景丽
    2020-12-13 17:53

    Everything has been said already, but I want to show you what your code should look like using JMH.

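    import java.util.concurrent.TimeUnit;
    import java.util.stream.IntStream;
    // import java.util.Random;  // only needed for the unsorted variant in setup()
    
    import org.openjdk.jmh.annotations.*;
    
    @Fork(3)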
    @BenchmarkMode(Mode.AverageTime)
    @Measurement(iterations = 10, timeUnit = TimeUnit.NANOSECONDS)
    @State(Scope.Benchmark)
    @Threads(1)
    @Warmup(iterations = 5, timeUnit = TimeUnit.NANOSECONDS)
    @OutputTimeUnit(TimeUnit.NANOSECONDS)
    public class MyBenchmark {
    
      private final int totalSize = 32_768;
      private final int filterValue = 1_280;
      private final int loopCount = 10_000;
      // private Random rnd;
    
      private int[] array;
    
      @Setup
      public void setup() {
        array = IntStream.range(0, totalSize).toArray();
    
        // rnd = new Random(0);
        // array = rnd.ints(totalSize).map(i -> i % 2560).toArray();
      }
    
      @Benchmark
      public long conditionalOperatorTime() {
        long sum = 0;
        for (int j = 0; j < loopCount; j++) {
          for (int c = 0; c < totalSize; ++c) {
            sum += array[c] >= filterValue ? array[c] : 0;
          }
        }
        return sum;
      }
    
      @Benchmark
      public long branchStatementTime() {
        long sum = 0;
        for (int j = 0; j < loopCount; j++) {
          for (int c = 0; c < totalSize; ++c) {
            if (array[c] >= filterValue) {
              sum += array[c];
            }
          }
        }
        return sum;
      }
    
      @Benchmark
      public long streamsTime() {
        long sum = 0;
        for (int j = 0; j < loopCount; j++) {
          sum += IntStream.of(array).filter(value -> value >= filterValue).sum();
        }
        return sum;
      }
    
      @Benchmark
      public long parallelStreamsTime() {
        long sum = 0;
        for (int j = 0; j < loopCount; j++) {
          sum += IntStream.of(array).parallel().filter(value -> value >= filterValue).sum();
        }
        return sum;
      }
    }
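
Before comparing timings, it may be worth confirming that all four variants compute the same sum on both kinds of input. The following is a minimal sketch in plain Java, without JMH; the class `VariantCheck` and its helper methods are hypothetical names introduced here, and the unsorted data simply follows the commented-out lines in `setup()`.

```java
import java.util.Random;
import java.util.stream.IntStream;

public class VariantCheck {

  // Sorted input, as built in setup(): 0, 1, ..., size - 1.
  static int[] sortedData(int size) {
    return IntStream.range(0, size).toArray();
  }

  // Unsorted input, following the commented-out lines in setup().
  // Note that i % 2560 is negative for negative i, so values lie in (-2560, 2560).
  static int[] unsortedData(int size, long seed) {
    return new Random(seed).ints(size).map(i -> i % 2560).toArray();
  }

  // Same body as the inner loop of conditionalOperatorTime().
  static long ternarySum(int[] array, int filterValue) {
    long sum = 0;
    for (int c = 0; c < array.length; ++c) {
      sum += array[c] >= filterValue ? array[c] : 0;
    }
    return sum;
  }

  // Same body as the inner loop of branchStatementTime().
  static long branchSum(int[] array, int filterValue) {
    long sum = 0;
    for (int c = 0; c < array.length; ++c) {
      if (array[c] >= filterValue) {
        sum += array[c];
      }
    }
    return sum;
  }

  // Same pipeline as streamsTime() / parallelStreamsTime().
  // IntStream.sum() returns an int; it does not overflow for these inputs.
  static long streamSum(int[] array, int filterValue, boolean parallel) {
    IntStream stream = IntStream.of(array);
    if (parallel) {
      stream = stream.parallel();
    }
    return stream.filter(value -> value >= filterValue).sum();
  }

  // True if all four variants return the same result for one pass.
  static boolean allAgree(int[] array, int filterValue) {
    long expected = branchSum(array, filterValue);
    return expected == ternarySum(array, filterValue)
        && expected == streamSum(array, filterValue, false)
        && expected == streamSum(array, filterValue, true);
  }

  public static void main(String[] args) {
    int totalSize = 32_768;
    int filterValue = 1_280;
    System.out.println("sorted agrees:   " + allAgree(sortedData(totalSize), filterValue));
    System.out.println("unsorted agrees: " + allAgree(unsortedData(totalSize, 0L), filterValue));
  }
}
```

This only checks correctness of a single pass; it says nothing about speed, which is what the JMH harness above is for.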
    

    The results for a sorted array:

    Benchmark                            Mode  Cnt           Score           Error  Units
    MyBenchmark.branchStatementTime      avgt   30   119833793,881 ±   1345228,723  ns/op
    MyBenchmark.conditionalOperatorTime  avgt   30   118146194,368 ±   1748693,962  ns/op
    MyBenchmark.parallelStreamsTime      avgt   30   499436897,422 ±   7344346,333  ns/op
    MyBenchmark.streamsTime              avgt   30  1126768177,407 ± 198712604,716  ns/op
    

    Results for unsorted data:

    Benchmark                            Mode  Cnt           Score           Error  Units
    MyBenchmark.branchStatementTime      avgt   30   534932594,083 ±   3622551,550  ns/op
    MyBenchmark.conditionalOperatorTime  avgt   30   530641033,317 ±   8849037,036  ns/op
    MyBenchmark.parallelStreamsTime      avgt   30   489184423,406 ±   5716369,132  ns/op
    MyBenchmark.streamsTime              avgt   30  1232020250,900 ± 185772971,366  ns/op
    

    I can only say that the JVM has many possible optimizations, and branch prediction may be involved as well. It is now up to you to interpret the benchmark results.
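
One way to start interpreting the scores is to normalize them: each benchmark invocation makes `loopCount` passes over `totalSize` elements, so dividing ns/op by their product gives a rough cost per element. The sketch below does that arithmetic for a few scores copied from the tables above (decimal commas written as points); the class `PerElementCost` and its methods are hypothetical helpers, not part of JMH.

```java
import java.util.Locale;

public class PerElementCost {

  // Elements visited per benchmark invocation: loopCount passes over totalSize elements.
  static long elementsPerOp(int loopCount, int totalSize) {
    return (long) loopCount * totalSize;
  }

  // Average cost per element in nanoseconds, given a JMH score in ns/op.
  static double nsPerElement(double nsPerOp, int loopCount, int totalSize) {
    return nsPerOp / elementsPerOp(loopCount, totalSize);
  }

  static void report(String label, double nsPerOp, int loopCount, int totalSize) {
    System.out.println(String.format(Locale.ROOT, "%-22s %.2f ns/element",
        label, nsPerElement(nsPerOp, loopCount, totalSize)));
  }

  public static void main(String[] args) {
    int loopCount = 10_000;  // from MyBenchmark
    int totalSize = 32_768;  // from MyBenchmark

    report("branch, sorted:", 119833793.881, loopCount, totalSize);
    report("branch, unsorted:", 534932594.083, loopCount, totalSize);
    report("streams, sorted:", 1126768177.407, loopCount, totalSize);
    report("streams, unsorted:", 1232020250.900, loopCount, totalSize);
  }
}
```

By this arithmetic the plain loop goes from roughly 0.37 ns to 1.63 ns per element when the data is unsorted, while the sequential stream stays between roughly 3.4 and 3.8 ns per element. Given the large error on `streamsTime`, the stream difference should be read with caution.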
