What does the compiler do in assembly when optimizing code (i.e. with the -O2 flag)?

感情败类 2020-12-29 12:42

When you add an optimization flag while compiling C++ code, the program runs faster, but how does this work? Could someone explain what really goes on in the assembly?

2 Answers
  •  爱一瞬间的悲伤
    2020-12-29 13:09

    It means you're making the compiler do extra work and analysis at compile time, so you can reap the rewards of a few precious CPU cycles saved at runtime. It's probably best to explain with an example.

    Consider a loop like this:

    const int n = 5;
    for (int i = 0; i < n; ++i)
      cout << "bleh" << endl;
    

    If you compile this without optimizations, the compiler will not do any extra work for you: the assembly generated for this snippet will likely be a literal translation into compare and jump instructions, which isn't the fastest form, just the most straightforward.
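
    To make "a literal translation into compare and jump instructions" concrete, here is a rough sketch in C++ itself rather than real compiler output. The function and label names are made up for illustration, and the text is written to a string stream with "\n" instead of cout and endl so the two versions can be compared.

```cpp
#include <sstream>
#include <string>

// The loop as it appears in the source.
std::string with_for_loop() {
    std::ostringstream out;
    const int n = 5;
    for (int i = 0; i < n; ++i)
        out << "bleh\n";
    return out.str();
}

// Roughly the shape an unoptimized build gives the same loop:
// one compare and one conditional jump per iteration, plus a jump back.
// The label names are hypothetical.
std::string with_compare_and_jump() {
    std::ostringstream out;
    const int n = 5;
    int i = 0;
loop_check:
    if (!(i < n)) goto loop_done;  // compare, then conditional jump out
    out << "bleh\n";               // loop body
    ++i;                           // increment the counter
    goto loop_check;               // unconditional jump back to the check
loop_done:
    return out.str();
}
```

    Both functions produce the same text; the second one just spells out the control flow that the for statement hides.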

    However, if you compile WITH optimizations, the compiler can unroll this loop, since n is const and the trip count is therefore known at compile time (i.e. it can emit the loop body 5 times in a row instead of checking the terminating loop condition on every iteration).
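
    The transformation described above is usually called loop unrolling. A minimal hand-written sketch of the idea follows; the function names are hypothetical, output goes to a string stream rather than cout, and whether a real compiler unrolls a particular loop depends on its own heuristics.

```cpp
#include <sstream>
#include <string>

// Original form: the condition i < n is checked before every iteration.
std::string rolled_loop() {
    std::ostringstream out;
    const int n = 5;
    for (int i = 0; i < n; ++i)
        out << "bleh\n";
    return out.str();
}

// Fully unrolled form: because n is a compile-time constant, the body can
// simply be repeated 5 times with no counter, compare, or jump at all.
std::string unrolled_loop() {
    std::ostringstream out;
    out << "bleh\n";
    out << "bleh\n";
    out << "bleh\n";
    out << "bleh\n";
    out << "bleh\n";
    return out.str();
}
```

    The trade-off is code size: the unrolled version has no loop overhead but takes more instructions, which is one reason compilers only unroll when they judge it worthwhile.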

    Here's another example with an optimized function call. Below is my whole program:

    #include <stdio.h>
    static int foo(int a, int b) {
      return a * b;
    } 
    
    
    int main(int argc, char** argv) {
      fprintf(stderr, "%d\n", foo(10, 15));
      return 0;
    }
    

    If I compile this code without optimizations using gcc foo.c on my x86-64 machine, my assembly looks like this:

    movq    %rsi, %rax
    movl    %edi, -4(%rbp)
    movq    %rax, -16(%rbp)
    movl    $10, %eax      # these are my parameters to
    movl    $15, %ecx      # the foo function
    movl    %eax, %edi
    movl    %ecx, %esi
    callq   _foo
    # .. about 20 other instructions ..
    callq   _fprintf
    

    Here, it's not optimizing anything. It's loading the registers with my constant values and calling my foo function. But look at what happens if I recompile with the -O2 flag:

    movq    (%rax), %rdi
    leaq    L_.str(%rip), %rsi
    movl    $150, %edx
    xorb    %al, %al
    callq   _fprintf
    

    The compiler is smart enough that it doesn't even call foo anymore. It inlines the call and folds 10 * 15 into the constant 150, which it passes straight to fprintf.
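
    The disappearance of the call can be thought of as two steps, inlining followed by constant folding. The sketch below walks through them by hand. Marking foo as constexpr and the three helper function names are my own additions for illustration; the original program uses a plain static function and leaves all of this to the optimizer.

```cpp
// foo from the program above, marked constexpr (an assumption of this
// sketch) so the compiler can be asked to evaluate it at compile time.
constexpr int foo(int a, int b) {
    return a * b;
}

// Evidence that the result is computable before the program ever runs.
static_assert(foo(10, 15) == 150, "foo(10, 15) is known at compile time");

// Step 0: what the source says. A real call with two arguments.
int call_foo() {
    return foo(10, 15);
}

// Step 1: inlining. The call is replaced by the body of foo.
int after_inlining() {
    return 10 * 15;
}

// Step 2: constant folding. The multiplication is done by the compiler,
// leaving only the literal that shows up as $150 in the -O2 assembly.
int after_constant_folding() {
    return 150;
}
```

    All three functions return 150; the optimizer's job is to notice that and emit only the cheapest form.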
