Why can't clang and gcc optimize away this int-to-float conversion?

Submitted by ◇◆丶佛笑我妖孽 on 2019-12-24 00:24:53

Question


Consider the following code:

void foo(float* __restrict__ a)
{
    int i; float val;
    for (i = 0; i < 100; i++) {
        val = 2 * i;
        a[i] = val;
    }
}

void bar(float* __restrict__ a)
{
    int i; float val = 0.0;
    for (i = 0; i < 100; i++) {
        a[i] = val;
        val += 2.0;
    }
}

They're based on Examples 7.26a and 7.26b in Agner Fog's Optimizing software in C++ and should do the same thing; bar is more "efficient" as written in the sense that we don't do an integer-to-float conversion at every iteration, but rather a float addition which is cheaper (on x86_64).

Here are the clang and gcc results on these two functions (with no vectorization and unrolling).

Question: It seems to me that the optimization of replacing a multiplication by the loop index with an addition of a constant value - when this is beneficial - should be carried out by compilers, even if (or perhaps especially if) there's a type conversion involved. Why is this not happening for these two functions?

Note that if we use ints rather than floats:

void foo(int* __restrict__ a)
{
    int i; int val = 0;
    for (i = 0; i < 100; i++) {
        val = 2 * i;
        a[i] = val;
    }
}

void bar(int* __restrict__ a)
{
    int i; int val = 0;
    for (i = 0; i < 100; i++) {
        a[i] = val;
        val += 2;
    }
}

Both clang and gcc perform the expected optimization, albeit not quite in the same way (see this question).


Answer 1:


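What you are asking for is induction variable optimization applied to floating-point values. This optimization is generally unsafe for floating point because it changes program semantics: repeated addition rounds at every step and the error accumulates, whereas converting from the integer index rounds only once per element. In your example it happens to work because the initial value (0.0), the step (2.0), and every partial sum up to 198 are exactly representable in IEEE format, but this is a rare case in practice.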
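To make the semantic difference concrete, here is a minimal sketch that compares the two loop shapes for an arbitrary step. The helper name count_mismatches and the step 0.1f are illustrative assumptions, not part of the original question, and the sketch assumes float expressions are evaluated in single precision (the usual case on x86_64 with SSE, without -ffast-math).

```c
#include <stddef.h>

/* Hypothetical helper (not from the question): counts the indices in
 * [0, n) where the "convert and multiply" form (the shape of foo)
 * and the "accumulate" form (the shape of bar) produce different
 * float values for the given step. */
size_t count_mismatches(size_t n, float step)
{
    size_t mismatches = 0;
    float acc = 0.0f;                    /* running sum, as in bar */

    for (size_t i = 0; i < n; i++) {
        float direct = (float)i * step;  /* one rounding, as in foo */
        if (direct != acc)
            mismatches++;
        acc += step;                     /* rounds again on every iteration */
    }
    return mismatches;
}
```

Under those assumptions, count_mismatches(100, 2.0f) returns 0, because every value involved is a small even integer and therefore exact. count_mismatches(100, 0.1f) returns a nonzero count, because 0.1 has no exact binary representation and the accumulated sum drifts away from the directly computed product. A compiler that rewrote one form into the other would change observable results in the second case.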

It could be enabled under -ffast-math, but it seems this was not considered an important case in GCC, which rejects non-integral induction variables early on (see tree-scalar-evolution.c).

If you believe this is an important use case, you might consider filing a request in GCC Bugzilla.



Source: https://stackoverflow.com/questions/48350805/why-cant-clang-and-gcc-optimize-away-this-int-to-float-conversion
