C++11 fast constexpr integer powers

再見小時候 2020-12-17 18:51

Beating the dead horse here. A typical (and fast) way of doing integer powers in C is this classic:

int64_t ipow(int64_t base, int exp){
  int64_t result = 1         
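
The listing above breaks off after its first statement. As a sketch of what the "classic" usually looks like, here is the standard square-and-multiply loop. It is an assumed reconstruction, not necessarily the asker's exact code; the name `ipow_loop` and the non-negative-exponent check are illustrative additions.

```cpp
#include <cassert>
#include <cstdint>

// Assumed reconstruction of the classic square-and-multiply loop.
// Preconditions (assumptions): exp >= 0 and no intermediate overflow.
std::int64_t ipow_loop(std::int64_t base, int exp) {
  assert(exp >= 0);   // with an arithmetic shift, a negative exp never reaches 0
  std::int64_t result = 1;
  while (exp) {
    if (exp & 1)      // low bit set: fold the current power of base into result
      result *= base;
    exp >>= 1;        // move on to the next bit of the exponent
    base *= base;     // base^(2^k) becomes base^(2^(k+1))
  }
  return result;
}
```

The loop runs once per bit of `exp`, so it needs O(log exp) multiplications instead of the `exp` multiplications of the naive version.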


        
2 Answers
  • 2020-12-17 19:03

    This appears to be a common problem with constexpr and template programming in C++. Because of the restrictions on what a constexpr function may contain, the constexpr version can be slower than an ordinary version when it is executed at runtime, yet overloading does not let you choose the right version for each context. The standardization committee is working on this issue; see, for example, the working paper N3583: http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2013/n3583.pdf
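
To make the constraint concrete: in C++11 the body of a constexpr function is limited to essentially a single return statement, so a loop has to be recast as recursion, and that one definition then serves both compile-time and runtime callers. A minimal sketch, assuming non-negative exponents; the names `ipow_cx` and `ipow_at_runtime` are hypothetical and not from the question.

```cpp
#include <cassert>
#include <cstdint>

// C++11-style constexpr: one return statement, recursion instead of a loop.
// Assumption: exp >= 0 (anything below 1 simply yields 1 here).
constexpr std::int64_t ipow_cx(std::int64_t base, int exp) {
  return exp <= 0 ? 1
       : exp % 2  ? base * ipow_cx(base, exp - 1)    // odd: peel off one factor
                  : ipow_cx(base * base, exp / 2);   // even: square the base
}

// Constant arguments: the compiler must evaluate this during compilation.
static_assert(ipow_cx(2, 10) == 1024, "evaluated at compile time");

// Runtime arguments: the very same recursive definition is called, because
// a second, non-constexpr overload with this signature cannot coexist with it.
std::int64_t ipow_at_runtime(std::int64_t base, int exp) {
  assert(exp >= 0);
  return ipow_cx(base, exp);
}
```

Whether the runtime path is actually slower depends on how well the optimizer handles the recursion.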

  • 2020-12-17 19:10

    A good optimizing compiler will transform tail-recursive functions to run as fast as imperative code. You can make this function tail-recursive by passing the running result along as an extra accumulator parameter ("pumping"). GCC 4.8.1 compiles this test program:

    #include <cstdint>
    
    constexpr int64_t ipow(int64_t base, int exp, int64_t result = 1) {
      return exp < 1 ? result : ipow(base*base, exp/2, (exp % 2) ? result*base : result);
    }
    
    int64_t foo(int64_t base, int exp) {
      return ipow(base, exp);
    }
    

    into a loop (see this at gcc.godbolt.org):

    foo(long, int):
        testl   %esi, %esi
        movl    $1, %eax
        jle .L4
    .L3:
        movq    %rax, %rdx
        imulq   %rdi, %rdx
        testb   $1, %sil
        cmovne  %rdx, %rax
        imulq   %rdi, %rdi
        sarl    %esi
        jne .L3
        rep; ret
    .L4:
        rep; ret
    

    vs. your while loop implementation:

    ipow(long, int):
        testl   %esi, %esi
        movl    $1, %eax
        je  .L4
    .L3:
        movq    %rax, %rdx
        imulq   %rdi, %rdx
        testb   $1, %sil
        cmovne  %rdx, %rax
        imulq   %rdi, %rdi
        sarl    %esi
        jne .L3
        rep; ret
    .L4:
        rep; ret
    

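    Apart from the initial branch (jle vs. je, since the recursive version tests exp < 1 rather than testing for zero), the two listings are instruction-for-instruction identical, which is good enough for me.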
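
The assembly only covers the runtime half of the story. As a rough check of the compile-time half, the same tail-recursive definition can be exercised with `static_assert`. This is a sketch: `ipow` and `foo` are reproduced from the test program above with `std::` qualification added, and the helper `check_runtime` is a hypothetical addition.

```cpp
#include <cassert>
#include <cstdint>

// Tail-recursive definition from the test program above.
constexpr std::int64_t ipow(std::int64_t base, int exp, std::int64_t result = 1) {
  return exp < 1 ? result
                 : ipow(base * base, exp / 2, (exp % 2) ? result * base : result);
}

// Constant arguments: these are evaluated entirely by the compiler.
static_assert(ipow(2, 10) == 1024, "2^10");
static_assert(ipow(3, 4) == 81, "3^4");
static_assert(ipow(7, 0) == 1, "x^0");

// Runtime arguments: the same function, compiled to ordinary code.
std::int64_t foo(std::int64_t base, int exp) {
  return ipow(base, exp);
}

// Hypothetical helper: spot-check the runtime path against known values.
void check_runtime() {
  assert(foo(2, 10) == 1024);
  assert(foo(-3, 3) == -27);
  assert(foo(5, -1) == 1);   // exp < 1 returns the initial accumulator
}
```

One caveat worth checking on your own compiler: the last recursive call still evaluates `base * base` even though that value is never used, so in a constant expression a large base may be rejected for signed overflow even when the final result would fit.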
