Constant embedded for loop condition optimization in C++ with gcc

Question

Will a compiler optimize this:

bool someCondition = someVeryTimeConsumingTask(/* ... */);

for (int i=0; i<HUGE_INNER_LOOP; ++i)
{
    if (someCondition)
        doCondition(i);
    else
        bacon(i);
}

into:

bool someCondition = someVeryTimeConsumingTask(/* ... */);

if (someCondition)
    for (int i=0; i<HUGE_INNER_LOOP; ++i)
        doCondition(i);
else
    for (int i=0; i<HUGE_INNER_LOOP; ++i)
        bacon(i);

someCondition is trivially constant within the for loop.

It may seem obvious that I should just do this myself, but with more than one condition you are dealing with permutations of for loops, so the code would get quite a bit longer. I am deciding whether to do it (I am already optimizing) or whether it would be a waste of my time.
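One way to avoid writing every permutation of the loop by hand is to turn the invariant conditions into template parameters, so the loop body is written once and the compiler generates one specialized loop per combination. The following is only a sketch under assumptions: the names runLoop and run, the two conditions a and b, and the arithmetic standing in for doCondition(i) and bacon(i) are all hypothetical and not part of the question.

```cpp
#include <vector>

// The loop body is written once. A and B are compile-time constants, so in
// each instantiation the branches on them can be resolved by the compiler.
template <bool A, bool B>
std::vector<int> runLoop(int n) {
    std::vector<int> out;
    for (int i = 0; i < n; ++i) {
        int v = A ? i : -i;  // stands in for doCondition(i) / bacon(i)
        if (B)
            v *= 2;          // stands in for work guarded by a second condition
        out.push_back(v);
    }
    return out;
}

// The runtime conditions are tested once, outside the loop, to select the
// matching instantiation.
std::vector<int> run(bool a, bool b, int n) {
    if (a) {
        if (b) return runLoop<true, true>(n);
        return runLoop<true, false>(n);
    }
    if (b) return runLoop<false, true>(n);
    return runLoop<false, false>(n);
}
```

The dispatcher still enumerates the combinations, but each one is a single line, and the loop itself exists in only one place in the source.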

Answer 1:

It's possible that the compiler might rewrite the code as you did, but I've never seen such an optimization.

However, modern CPUs have something called branch prediction. In essence, when the processor encounters a conditional jump, it starts executing what it judges to be the likeliest branch before the condition has been evaluated. This is done to keep the pipeline full of instructions.

If the processor guesses wrong (and takes the bad branch), the pipeline has to be flushed: this is called a misprediction.

A very common trait of this feature is that if the same test produces the same result several times in a row, the branch prediction algorithm will assume it keeps producing that result... which is of course tailored for loops :)

It makes me smile because you are worrying about the if within the for body while the for itself also relies on branch prediction: its condition must be evaluated at each iteration to check whether or not to continue ;)

So, don't worry about it, it costs less than a cache miss.

Now, if you really are worried about this, there is always the functor approach (here, a plain function pointer).

typedef void (*functor_t)(int);

functor_t func = 0;
if (someCondition) func = &doCondition;
else func = &bacon;

for (int i=0; i<HUGE_INNER_LOOP; ++i) (*func)(i);

which sure looks much better, doesn't it? The obvious drawback is that the functions need compatible signatures, but you can write wrappers around them for that. As long as you don't need to break/return, you'll be fine with this. Otherwise you would need an if in the loop body :D
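A variation on the same idea passes the loop body as a template parameter instead of a function pointer. A call through a function pointer is indirect and may not be inlined, whereas a function object's operator() is known at compile time. The sketch below assumes hypothetical function objects DoCondition and Bacon that only log their argument, and the helper names forEachIndex and runWithFunctor are made up for illustration.

```cpp
#include <vector>

// Hypothetical function objects standing in for doCondition() and bacon();
// they record their argument so the behavior can be observed.
struct DoCondition {
    std::vector<int>* log;
    void operator()(int i) const { log->push_back(i); }
};

struct Bacon {
    std::vector<int>* log;
    void operator()(int i) const { log->push_back(-i); }
};

// The loop is written once; each Body type gets its own instantiation,
// so the call inside the loop is a direct call to a known function.
template <typename Body>
void forEachIndex(int n, Body body) {
    for (int i = 0; i < n; ++i)
        body(i);
}

// The condition is tested once, outside the loop.
std::vector<int> runWithFunctor(bool someCondition, int n) {
    std::vector<int> log;
    if (someCondition)
        forEachIndex(n, DoCondition{&log});
    else
        forEachIndex(n, Bacon{&log});
    return log;
}
```

Unlike the function-pointer version, the two bodies do not need identical signatures here; they only need to be callable with an int.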

Answer 2:

It does not seem to do so with either -O2 or -O3 optimisations. This is something you can (and should, if you are concerned with optimisation) test for yourself - compile with the optimisation you are interested in and examine the emitted assembly language.
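For anyone who wants to run that check, here is a small test file that could be used. The transformation in question is usually called loop unswitching, and the GCC manual documents a -funswitch-loops option for it (listed there as enabled at -O3); whether it actually fires for a given loop depends on the compiler version and its heuristics. The file name, the counters, and the trivial loop bodies are assumptions for illustration, and the noinline attribute is a GCC extension used here only to keep the two calls visible in the output.

```cpp
// unswitch.cpp (hypothetical file name): a test case for reading the assembly.
// Possible commands, assuming g++ is on the PATH:
//   g++ -O2 -S -o unswitch_O2.s unswitch.cpp
//   g++ -O3 -S -o unswitch_O3.s unswitch.cpp
//   g++ -O2 -funswitch-loops -S -o unswitch_flag.s unswitch.cpp
// If the loop was unswitched, the output contains two separate loops,
// one calling doCondition and one calling bacon.

// volatile so that the stores cannot be optimized away.
volatile long conditionSum = 0;
volatile long baconSum = 0;

// Hypothetical loop bodies; noinline keeps them as real calls.
__attribute__((noinline)) void doCondition(int i) { conditionSum = conditionSum + i; }
__attribute__((noinline)) void bacon(int i) { baconSum = baconSum + i; }

// The loop from the question, with the invariant condition as a parameter.
void loopUnderTest(bool someCondition, int n) {
    for (int i = 0; i < n; ++i) {
        if (someCondition)
            doCondition(i);
        else
            bacon(i);
    }
}
```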

Answer 3:

Have you profiled your app to find out where the slowdowns are? If not, why are you even thinking about optimization? Until you know which methods need to be optimized, you're wasting your time worrying about micro-optimizations like this.

Is this the location of the slowdown? If so, then what you're doing may be useful. Yes, the compiler may optimize this, but there's no guarantee that it does. If this isn't the location of the slowdown, then look elsewhere; the cost of one additional branch every time through the loop is probably trivial relative to all of the other work you're doing.
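A real profiler is the right tool for finding the hot spot, but once a loop is suspected, a rough timing comparison of the two variants is easy to set up. The sketch below uses std::chrono rather than a profiler, and the helper timeMs and the two loop functions are hypothetical. The loop bodies are deliberately trivial, so an optimizer may simplify them heavily; a meaningful measurement would use the actual loop bodies.

```cpp
#include <chrono>

// Runs fn() once and returns the elapsed wall-clock time in milliseconds.
template <typename Fn>
double timeMs(Fn fn) {
    std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now();
    fn();
    std::chrono::steady_clock::time_point stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Variant 1: the invariant condition is tested on every iteration.
long branchInside(bool cond, int n) {
    long sum = 0;
    for (int i = 0; i < n; ++i) {
        if (cond) sum += i;
        else      sum -= i;
    }
    return sum;
}

// Variant 2: the condition is tested once and each branch has its own loop.
long branchHoisted(bool cond, int n) {
    long sum = 0;
    if (cond) {
        for (int i = 0; i < n; ++i) sum += i;
    } else {
        for (int i = 0; i < n; ++i) sum -= i;
    }
    return sum;
}
```

A call such as timeMs([&] { result = branchInside(flag, n); }) can then be compared with the same call on branchHoisted, ideally repeated several times, with result being a variable that is used afterwards so the work is not discarded.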

Source: https://stackoverflow.com/questions/2883353/constant-embedded-for-loop-condition-optimization-in-c-with-gcc

Tags

c++

optimization

gcc

g++