Why doesn't GCC optimize a*a*a*a*a*a to (a*a*a)*(a*a*a)?

孤街浪徒 2020-11-22 08:41

I am doing some numerical optimization on a scientific application. One thing I noticed is that GCC will optimize the call pow(a,2) by compiling it into a

12 answers
  •  温柔的废话
    2020-11-22 09:26

    GCC can actually perform this optimization, even for floating-point numbers. For example,

    double foo(double a) {
      return a*a*a*a*a*a;
    }
    

    becomes

    foo(double):
        mulsd   %xmm0, %xmm0
        movapd  %xmm0, %xmm1
        mulsd   %xmm0, %xmm1
        mulsd   %xmm1, %xmm0
        ret
    

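    with -O -funsafe-math-optimizations. The flag is required because floating-point multiplication is not associative: regrouping the factors changes where rounding happens, so the result can differ in the last bits from the left-to-right evaluation that the source specifies. Without the flag, GCC preserves the IEEE-754 result of the expression as written.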
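
    To see why the flag is needed, here is a minimal sketch that compares the two groupings directly. The function names (pow6_sequential, pow6_grouped, count_mismatches, max_relative_difference) and the sample grid 1.001, 1.002, ... are my own choices for illustration, and the sketch assumes it is compiled without -ffast-math or -funsafe-math-optimizations, so the compiler keeps each grouping as written.

```c
/* Left-to-right product, which is how a*a*a*a*a*a parses: five rounded multiplies. */
double pow6_sequential(double a) {
    return a * a * a * a * a * a;
}

/* Regrouped as (a*a*a)*(a*a*a): three rounded multiplies. */
double pow6_grouped(double a) {
    double cube = a * a * a;
    return cube * cube;
}

/* Counts the inputs 1 + i/1000, i = 1..n, for which the two groupings
   produce different doubles. */
int count_mismatches(int n) {
    int mismatches = 0;
    for (int i = 1; i <= n; ++i) {
        double a = 1.0 + i / 1000.0;
        if (pow6_sequential(a) != pow6_grouped(a)) {
            ++mismatches;
        }
    }
    return mismatches;
}

/* Largest relative difference between the two groupings on the same grid. */
double max_relative_difference(int n) {
    double worst = 0.0;
    for (int i = 1; i <= n; ++i) {
        double a = 1.0 + i / 1000.0;
        double s = pow6_sequential(a);
        double g = pow6_grouped(a);
        double diff = (s > g) ? (s - g) : (g - s);
        double rel = diff / s;   /* s is positive for every a on this grid */
        if (rel > worst) {
            worst = rel;
        }
    }
    return worst;
}
```

    Calling count_mismatches(1000) from a small main should report a nonzero count, while max_relative_difference(1000) stays within a few units in the last place. The differences are tiny, but they are enough that the compiler is not allowed to regroup on its own.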

    For signed integers, as Peter Cordes pointed out in a comment, GCC can do this optimization without -funsafe-math-optimizations. The regrouped product is exactly equal to the original whenever nothing overflows, and signed overflow is undefined behavior, so the compiler may assume it never happens. So you get

    foo(long):
        movq    %rdi, %rax
        imulq   %rdi, %rax
        imulq   %rdi, %rax
        imulq   %rax, %rax
        ret
    

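    with just -O. For unsigned integers, it's even easier: unsigned arithmetic is defined modulo 2^N (where N is the width of the type), and multiplication modulo 2^N is associative, so the factors can be regrouped freely even when the product wraps around.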
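
    The unsigned case can be checked directly. This is a small sketch with names of my own (upow6_sequential, upow6_grouped, unsigned_forms_agree) and an arbitrary set of sample inputs, several of which wrap around in 64 bits.

```c
#include <stddef.h>
#include <stdint.h>

/* Left-to-right product; every multiply wraps modulo 2^64. */
uint64_t upow6_sequential(uint64_t a) {
    return a * a * a * a * a * a;
}

/* Regrouped as (a*a*a)*(a*a*a); also wraps modulo 2^64. */
uint64_t upow6_grouped(uint64_t a) {
    uint64_t cube = a * a * a;
    return cube * cube;
}

/* Returns 1 if both groupings agree on every sample, 0 otherwise.
   The samples are arbitrary; the larger ones overflow 64 bits many times over. */
int unsigned_forms_agree(void) {
    const uint64_t samples[] = {
        0u, 1u, 3u, 1000u, 0xFFFFFFFFu, 0x0123456789ABCDEFull, UINT64_MAX
    };
    const size_t count = sizeof samples / sizeof samples[0];
    for (size_t i = 0; i < count; ++i) {
        if (upow6_sequential(samples[i]) != upow6_grouped(samples[i])) {
            return 0;
        }
    }
    return 1;
}
```

    Both functions return the same value for every input, including UINT64_MAX, which is congruent to -1 modulo 2^64 and therefore has sixth power 1. The same comparison with signed types would only be valid for inputs whose sixth power fits in the type, since overflow there is undefined behavior.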
