Why are compilers so stupid?

借酒劲吻你 2020-11-29 18:07

I always wonder why compilers can't figure out simple things that are obvious to the human eye. They do lots of simple optimizations, but never something even a little bit

29 answers
  •  无人及你
    2020-11-29 18:45

    Well, I can only speak for C++, because I'm a complete Java beginner. In C++, compilers are free to disregard any requirement the Standard places on the language, as long as the observable behavior is as if the compiler had actually followed all of the Standard's rules (the "as-if" rule). Observable behavior is defined as reads and writes of volatile data and calls to library I/O functions. Consider this:

    extern int x; // defined elsewhere
    for (int i = 0; i < 100 * 1000 * 1000 * 1000; ++i) {
        x += x + x + x + x + x;
    }
    return x;
    

    The C++ compiler is allowed to optimize that loop away and add the resulting value to x just once, because the observable behavior is the same as if the loop had run: no volatile data and no library functions are involved, so the loop has no side effects that must be preserved. Now consider volatile variables:

    extern volatile int x; // defined elsewhere
    for (int i = 0; i < 100 * 1000 * 1000 * 1000; ++i) {
        x += x + x + x + x + x;
    }
    return x;
    

    The compiler is no longer allowed to perform that optimization, because every read and write of volatile data is itself part of the observable behavior of the program. After all, x could be mapped to a memory cell watched by some hardware device that triggers on every write.
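
    Returning to the non-volatile loop: "add the proper value to x once" can be made concrete with a small sketch. It is written in Java rather than C++ only because Java int arithmetic wraps on overflow, which keeps the loop well-defined for any starting value. The class and method names are made up for illustration, and this is not a claim about what any particular compiler actually emits. Each pass turns x into 6 * x, so n passes amount to multiplying by 6^n.

```java
// Hypothetical helper class, for illustration only.
final class LoopFolding {
    // The loop as written in the answer, with the iteration count passed in.
    // Each pass computes x + 5x = 6x in wrapping 32-bit arithmetic.
    static int withLoop(int x, int iterations) {
        for (int i = 0; i < iterations; ++i) {
            x += x + x + x + x + x;
        }
        return x;
    }

    // What an optimizer could compute instead: x * 6^iterations, with the
    // power obtained by square-and-multiply in the same wrapping arithmetic.
    static int folded(int x, int iterations) {
        int result = x;
        int base = 6;
        for (int n = iterations; n > 0; n >>= 1) {
            if ((n & 1) != 0) {
                result *= base;
            }
            base *= base;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(withLoop(3, 10)); // 3 * 6^10 = 181398528
        System.out.println(folded(3, 10));   // same value, in 4 steps instead of 10
    }
}
```

    Both methods return the same value for every input, which is all the as-if rule asks for. With x starting at 0, the folded result is 0 no matter how many iterations are requested.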


    Speaking of Java, I have tested your loop. Compiled with the GNU Java Compiler (gcj) without optimization, it takes an inordinate amount of time (it simply didn't finish, and I killed it). With optimization enabled (-O2), it printed 0 immediately:

    [js@HOST2 java]$ gcj --main=Optimize -O2 Optimize.java
    [js@HOST2 java]$ ./a.out
    0
    [js@HOST2 java]$
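
    The 0 itself is easy to account for, assuming the tested loop starts from x = 0 as the doIt method in this answer does: each pass computes x + 5x, which stays 0. It is also worth noting that in Java the bound 100 * 1000 * 1000 * 1000 is evaluated in 32-bit int arithmetic and wraps around, so the loop runs about 1.2 billion times rather than 100 billion. A small sketch of both facts (class and method names are made up for illustration):

```java
// Hypothetical helper class, for illustration only.
final class LoopFacts {
    // The bound exactly as written in the loop: 32-bit int arithmetic, wraps around.
    static int boundAsInt() {
        return 100 * 1000 * 1000 * 1000;
    }

    // The mathematically intended bound, computed in 64-bit long arithmetic.
    static long boundAsLong() {
        return 100L * 1000 * 1000 * 1000;
    }

    // One pass of the loop body.
    static int step(int x) {
        x += x + x + x + x + x;
        return x;
    }

    public static void main(String[] args) {
        System.out.println(boundAsInt());  // 1215752192
        System.out.println(boundAsLong()); // 100000000000
        System.out.println(step(0));       // 0: zero is a fixed point of the loop body
    }
}
```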
    

    Maybe that observation is helpful in this thread. Why is it so fast with gcj? One reason surely is that gcj compiles to machine code ahead of time, so it has no opportunity to optimize based on the runtime behavior of the code. It therefore puts all its effort into optimizing as much as it can at compile time. A virtual machine, however, can compile code just in time, as the following output of java shows for this code:

    class Optimize {
        private static int doIt() {
            int x = 0;
            for (int i = 0; i < 100 * 1000 * 1000 * 1000; ++i) {
                x += x + x + x + x + x;
            }
            return x;
        }
        public static void main(String[] args) {
        for (int i = 0; i < 5; i++) {
                doIt();
            }
        }
    }
    

    Output for java -XX:+PrintCompilation Optimize:

    1       java.lang.String::hashCode (60 bytes)
    1%      Optimize::doIt @ 4 (30 bytes)
    2       Optimize::doIt (30 bytes)
    

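    As we see, the JIT compiles the doIt function twice. The first entry, marked with %, is an on-stack replacement (OSR) compilation that enters the method at bytecode index 4 while the loop is already running; the second is a regular compilation of the whole method, made on the basis of what was observed during the first execution. Both entries report 30 bytes. That figure is the size of the method's bytecode rather than of the generated machine code, so it is the same both times and does not by itself show whether the loop is still in place.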
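
    One way to see the effect of those compilations from inside the program is to time each call separately. The following is only a rough sketch, not a proper benchmark: the class name, the timeCalls helper, and the iteration count passed in are assumptions for illustration, and timings will vary between machines and JVM versions.

```java
// Hypothetical timing harness, for illustration only.
final class JitTiming {
    // Same loop body as doIt in the answer, with the iteration count passed in.
    static int doIt(int iterations) {
        int x = 0;
        for (int i = 0; i < iterations; ++i) {
            x += x + x + x + x + x;
        }
        return x;
    }

    // Calls doIt repeatedly and records the elapsed nanoseconds of each call.
    static long[] timeCalls(int calls, int iterations) {
        long[] nanos = new long[calls];
        int sink = 0; // consume the results so the calls are not trivially unused
        for (int c = 0; c < calls; c++) {
            long start = System.nanoTime();
            sink += doIt(iterations);
            nanos[c] = System.nanoTime() - start;
        }
        if (sink != 0) {
            throw new IllegalStateException("x starts at 0, so every result must be 0");
        }
        return nanos;
    }

    public static void main(String[] args) {
        long[] nanos = timeCalls(5, 100_000_000);
        for (int c = 0; c < nanos.length; c++) {
            System.out.println("call " + c + ": " + nanos[c] / 1_000_000 + " ms");
        }
    }
}
```

    Running it together with -XX:+PrintCompilation makes it possible to line up changes in per-call time with the moments at which doIt gets compiled.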

    As another programmer shows, the execution time of certain dead loops can even increase in subsequently compiled code. He reported a bug, which can be read here and is dated 24 October 2008.
