Why is 2 * (i * i) faster than 2 * i * i in Java?

一生所求 2020-12-22 14:43

The following Java program takes on average between 0.50 secs and 0.55 secs to run:

public static void main(String[] args) {
    long startTime = System.nano         


        
10 answers
  •  难免孤独
    2020-12-22 15:02

    I ran a JMH benchmark using the default archetype, and I also added an optimized version based on Runemoro's explanation.

    package org.sample;

    import java.util.concurrent.TimeUnit;
    import org.openjdk.jmh.annotations.*;

    @State(Scope.Benchmark)
    @Warmup(iterations = 2)
    @Fork(1)
    @Measurement(iterations = 10)
    @OutputTimeUnit(TimeUnit.NANOSECONDS)
    //@BenchmarkMode({ Mode.All })
    @BenchmarkMode(Mode.AverageTime)
    public class MyBenchmark {
      @Param({ "100", "1000", "1000000000" })
      private int size;
    
      @Benchmark
      public int two_square_i() {
        int n = 0;
        for (int i = 0; i < size; i++) {
          n += 2 * (i * i);
        }
        return n;
      }
    
      @Benchmark
      public int square_i_two() {
        int n = 0;
        for (int i = 0; i < size; i++) {
          n += i * i;
        }
        return 2*n;
      }
    
      @Benchmark
      public int two_i_() {
        int n = 0;
        for (int i = 0; i < size; i++) {
          n += 2 * i * i;
        }
        return n;
      }
    }
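
    The three loops are only a fair comparison if they return the same value. A minimal sketch to check that outside JMH; the class name VariantCheck and the static-method form are hypothetical, not part of the benchmark above. It relies on Java int arithmetic wrapping modulo 2^32, so the rearranged expressions still agree after the sum overflows.

```java
// Hypothetical helper, not part of the original benchmark.
class VariantCheck {
    // Same loop body as two_square_i.
    static int twoSquareI(int size) {
        int n = 0;
        for (int i = 0; i < size; i++) {
            n += 2 * (i * i);
        }
        return n;
    }

    // Same loop body as square_i_two: the multiplication by 2 is hoisted out of the loop.
    static int squareITwo(int size) {
        int n = 0;
        for (int i = 0; i < size; i++) {
            n += i * i;
        }
        return 2 * n;
    }

    // Same loop body as two_i_.
    static int twoII(int size) {
        int n = 0;
        for (int i = 0; i < size; i++) {
            n += 2 * i * i;
        }
        return n;
    }

    public static void main(String[] args) {
        int size = 100_000; // large enough for the int sum to overflow
        int a = twoSquareI(size);
        int b = squareITwo(size);
        int c = twoII(size);
        System.out.println(a + " " + b + " " + c);
        System.out.println("all equal: " + (a == b && b == c));
    }
}
```

    If the three values ever differed, the timings below would be measuring different computations.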
    

    The results are here:

    Benchmark                           (size)  Mode  Samples          Score   Score error  Units
    o.s.MyBenchmark.square_i_two           100  avgt       10         58,062         1,410  ns/op
    o.s.MyBenchmark.square_i_two          1000  avgt       10        547,393        12,851  ns/op
    o.s.MyBenchmark.square_i_two    1000000000  avgt       10  540343681,267  16795210,324  ns/op
    o.s.MyBenchmark.two_i_                 100  avgt       10         87,491         2,004  ns/op
    o.s.MyBenchmark.two_i_                1000  avgt       10       1015,388        30,313  ns/op
    o.s.MyBenchmark.two_i_          1000000000  avgt       10  967100076,600  24929570,556  ns/op
    o.s.MyBenchmark.two_square_i           100  avgt       10         70,715         2,107  ns/op
    o.s.MyBenchmark.two_square_i          1000  avgt       10        686,977        24,613  ns/op
    o.s.MyBenchmark.two_square_i    1000000000  avgt       10  652736811,450  27015580,488  ns/op
    

    On my PC (a Core i7 860, doing little else while I read on my smartphone):

    • n += i * i followed by 2 * n is fastest.
    • 2 * (i * i) is second.
    • 2 * i * i is slowest.
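
    To put numbers on that ranking, the size = 1000000000 rows can be turned into relative slowdowns. A minimal sketch; the class and member names are hypothetical, and the scores are copied from the results table with its decimal commas read as decimal points.

```java
// Hypothetical helper: relative slowdowns computed from the reported averages.
class BenchmarkRatios {
    // Average times in ns/op for size = 1000000000, copied from the results table.
    static final double SQUARE_I_TWO = 540343681.267;
    static final double TWO_SQUARE_I = 652736811.450;
    static final double TWO_I_I = 967100076.600;

    // How many times slower one variant is than another.
    static double ratio(double slower, double faster) {
        return slower / faster;
    }

    public static void main(String[] args) {
        System.out.printf("2 * (i * i) vs. hoisted 2 * n: %.2fx%n",
                ratio(TWO_SQUARE_I, SQUARE_I_TWO));
        System.out.printf("2 * i * i vs. 2 * (i * i): %.2fx%n",
                ratio(TWO_I_I, TWO_SQUARE_I));
        System.out.printf("ns per iteration for 2 * i * i: %.2f%n",
                TWO_I_I / 1_000_000_000);
    }
}
```

    The two ratios come out at roughly 1.2 and 1.5, and both gaps are much larger than the reported score errors.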

    The JVM is clearly not optimizing the same way a human would (based on Runemoro's answer).

    Now then, reading bytecode: javap -c -v ./target/classes/org/sample/MyBenchmark.class

    • Differences between 2*(i*i) (left) and 2*i*i (right) here: https://www.diffchecker.com/cvSFppWI
    • Differences between 2*(i*i) and the optimized version here: https://www.diffchecker.com/I1XFu5dP

    I am no expert on bytecode, but in 2 * (i * i) we iload_2 twice before the first imul, and that is probably where the difference comes from. My guess is that the JVM can optimize the second read of i (the value is already there, so there is no need to load it again), whereas in 2 * i * i it cannot.
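
    The instruction order described here can be reproduced with a small class and javap -c. A minimal sketch with hypothetical names; the bytecode in the comments is what javac typically emits for these expressions and may differ in detail between compiler versions. Keep in mind that javac does not reorder the arithmetic, and that the code actually being timed is what the JIT compiler generates from this bytecode.

```java
// Hypothetical class for inspecting bytecode. Compile it, then run: javap -c BytecodeDemo
class BytecodeDemo {
    // Local variable slots in these static methods: size = 0, n = 1, i = 2.
    static int parenthesized(int size) {
        int n = 0;
        for (int i = 0; i < size; i++) {
            // Typical javac output for the right-hand side:
            //   iconst_2, iload_2, iload_2, imul, imul
            n += 2 * (i * i);
        }
        return n;
    }

    static int leftToRight(int size) {
        int n = 0;
        for (int i = 0; i < size; i++) {
            // Typical javac output for the right-hand side:
            //   iconst_2, iload_2, imul, iload_2, imul
            n += 2 * i * i;
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(parenthesized(1000));
        System.out.println(leftToRight(1000));
    }
}
```

    In the benchmark's instance methods the slots line up the same way (this = 0, n = 1, i = 2), which is why i shows up as iload_2 there as well.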
