How exactly does gcc do optimizations?

Posted by 可紊 on 2019-12-04 08:17:11

Asking "why" about optimizers is usually a waste of time, because there are no "rules" by which optimizers operate -- other than "as if": The optimizer may not change the observable behaviour of conforming code.

The "observable behaviour" of both your programs is to print "hello" repeatedly.

In your first program, the counting is optimized away, making the observable behaviour happen faster. That is the job of an optimizer. Be happy your code is more efficient now!
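To make the "as if" rule concrete, here is a minimal sketch. It is not the code from the question; the function names count_plain and count_volatile are made up for illustration. The first loop has no observable effect, so an optimizer may drop it or replace it with a direct assignment. In the second, every access to a volatile object counts as observable behaviour, so the compiler has to perform each read and write.

```c
/* Hypothetical helper: counts up to n and returns the result.
   Nothing inside the loop is observable, so under the "as if" rule
   the compiler may remove the loop and simply return n. */
unsigned long count_plain(unsigned long n)
{
    unsigned long i = 0;
    while (i < n) {
        i++;
    }
    return i;
}

/* Same loop, but the counter is volatile. Accesses to volatile
   objects are part of the observable behaviour, so the compiler
   must actually execute every iteration. */
unsigned long count_volatile(unsigned long n)
{
    volatile unsigned long i = 0;
    while (i < n) {
        i++;
    }
    return i;
}
```

Both functions return n for any input; only the time they take may differ once optimization is enabled. Whether a particular GCC version really removes the first loop depends on the version and flags, so comparing the assembly from gcc -O2 -S is the only way to be sure.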

In your second program, the counting is not optimized away, because somehow the optimizer -- in this version of this compiler with this setting -- did not see that it could do without it. Why? Who knows (other than the maintainer of the compiler's optimizer module)?

If your desired behaviour is to have a delay between outputs, use something like thrd_sleep() (C11, declared in <threads.h>). Empty count loops were a way to delay BASIC 2.0 programs on the C64, but they should not be used in C, for the exact reason you just observed: you never know what the optimizer will do with them.
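A minimal sketch of the thrd_sleep() approach follows. The helper names sleep_ms and print_hello_slowly are invented for this example, and the loop is bounded only so that the sketch terminates. It assumes a C library that ships the optional C11 <threads.h> header (recent glibc does; some platforms do not).

```c
#include <stdio.h>
#include <threads.h>  /* thrd_sleep (C11, optional header) */
#include <time.h>     /* struct timespec */

/* Hypothetical helper: pause the calling thread for ms milliseconds.
   Returns what thrd_sleep returns: 0 on success, negative otherwise. */
int sleep_ms(long ms)
{
    struct timespec duration;
    duration.tv_sec  = ms / 1000;
    duration.tv_nsec = (ms % 1000) * 1000000L;
    return thrd_sleep(&duration, NULL);
}

/* Hypothetical helper: print "hello" the given number of times with a
   real pause between outputs instead of an empty counting loop.
   Returns how many lines were printed. */
int print_hello_slowly(int times, long pause_ms)
{
    int printed = 0;
    for (int i = 0; i < times; i++) {
        puts("hello");
        fflush(stdout);
        printed++;
        /* Sleep between outputs, but not after the last one. */
        if (i + 1 < times && sleep_ms(pause_ms) != 0) {
            break;  /* sleep was interrupted or failed */
        }
    }
    return printed;
}
```

Calling print_hello_slowly(3, 500) prints three lines roughly half a second apart. Because the delay is a library call rather than a side-effect-free loop, the optimizer cannot remove it.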

In program 2, the branching in the if statement depends on something that happened in the previous iteration of the loop. In program 1, the compiler can easily determine that i is incremented in every iteration of the while loop (the increment is right at the top); in program 2 it cannot.

Anyway, compiler optimizations are very complicated. See below:

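gcc -O2 is a shortcut for the flags below (from the GCC documentation). The list combines the flags documented for -O1 with the additional ones documented for -O2, and the exact set varies between GCC versions: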

      -fauto-inc-dec 
      -fbranch-count-reg 
      -fcombine-stack-adjustments 
      -fcompare-elim 
      -fcprop-registers 
      -fdce 
      -fdefer-pop 
      -fdelayed-branch 
      -fdse 
      -fforward-propagate 
      -fguess-branch-probability 
      -fif-conversion2 
      -fif-conversion 
      -finline-functions-called-once 
      -fipa-pure-const 
      -fipa-profile 
      -fipa-reference 
      -fmerge-constants 
      -fmove-loop-invariants 
      -freorder-blocks 
      -fshrink-wrap 
      -fsplit-wide-types 
      -fssa-backprop 
      -fssa-phiopt 
      -ftree-bit-ccp 
      -ftree-ccp 
      -ftree-ch 
      -ftree-coalesce-vars 
      -ftree-copy-prop 
      -ftree-dce 
      -ftree-dominator-opts 
      -ftree-dse 
      -ftree-forwprop 
      -ftree-fre 
      -ftree-phiprop 
      -ftree-sink 
      -ftree-slsr 
      -ftree-sra 
      -ftree-pta 
      -ftree-ter 
      -funit-at-a-time
      -fthread-jumps 
      -falign-functions  -falign-jumps 
      -falign-loops  -falign-labels 
      -fcaller-saves 
      -fcrossjumping 
      -fcse-follow-jumps  -fcse-skip-blocks 
      -fdelete-null-pointer-checks 
      -fdevirtualize -fdevirtualize-speculatively 
      -fexpensive-optimizations 
      -fgcse  -fgcse-lm  
      -fhoist-adjacent-loads 
      -finline-small-functions 
      -findirect-inlining 
      -fipa-cp 
      -fipa-cp-alignment 
      -fipa-sra 
      -fipa-icf 
      -fisolate-erroneous-paths-dereference 
      -flra-remat 
      -foptimize-sibling-calls 
      -foptimize-strlen 
      -fpartial-inlining 
      -fpeephole2 
      -freorder-blocks-algorithm=stc 
      -freorder-blocks-and-partition -freorder-functions 
      -frerun-cse-after-loop  
      -fsched-interblock  -fsched-spec 
      -fschedule-insns  -fschedule-insns2 
      -fstrict-aliasing -fstrict-overflow 
      -ftree-builtin-call-dce 
      -ftree-switch-conversion -ftree-tail-merge 
      -ftree-pre 
      -ftree-vrp 
      -fipa-ra

Each of these flags enables or tunes one optimization that the compiler is allowed to perform. To see which ones your own GCC turns on at a given level, run gcc -Q -O2 --help=optimizers.
