Why and how does GCC compile a function with a missing return statement?

萌比男神i 2020-12-01 23:57
#include <stdio.h>

char toUpper(char);

int main(void)
{
    char ch, ch2;
    printf("lowercase input : ");
    ch = getchar();
    ch2 = toUpper(ch);
             


        
8 answers
  • 2020-12-02 00:35

    I tried a small program:

    #include <stdio.h>
    int f1() {
    }
    int main() {
        printf("TEST: <%d>\n",  f1());
        printf("TEST: <%d>\n",  f1());
        printf("TEST: <%d>\n",  f1());
        printf("TEST: <%d>\n",  f1());
        printf("TEST: <%d>\n",  f1());
    }
    

    Result:

    TEST: <1>

    TEST: <10>

    TEST: <11>

    TEST: <11>

    TEST: <11>

    I used the mingw32-gcc compiler, so there might be differences with other compilers.

    You can experiment with other return types, e.g. a char function. As long as you don't use the returned value, it will still appear to work.

    #include <stdio.h>
    char f1() {
    }
    int main() {
        f1();
    }
    

    But I would still recommend either declaring the function void or returning a value.

    Your function needs a return statement:

    char toUpper(char c)
    {
        if(c>='a'&&c<='z')
            c = c - 32;
        return c;
    }
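
A possible variant, as a sketch only: the `c - 32` arithmetic above assumes an ASCII-like character set, whereas the standard `toupper` from `<ctype.h>` does not. The names `to_upper_portable` and `check_to_upper_portable` are made up for this illustration and are not from the question.

```c
#include <assert.h>
#include <ctype.h>

/* Hypothetical name: same job as toUpper above, without ASCII arithmetic. */
char to_upper_portable(char c)
{
    /* toupper() expects a value representable as unsigned char (or EOF),
       so cast first to avoid passing a negative char. */
    return (char)toupper((unsigned char)c);
}

/* Usage sketch: every path through the function returns a value,
   so the caller never depends on leftover register contents. */
void check_to_upper_portable(void)
{
    assert(to_upper_portable('a') == 'A');
    assert(to_upper_portable('Z') == 'Z');
    assert(to_upper_portable('5') == '5');
}
```

The checks assume the default "C" locale, where only the letters a to z are mapped.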
    
  • 2020-12-02 00:38

    What happened for you is that when the C program was compiled into assembly language, your toUpper function ended up like this, perhaps:

    _toUpper:
    LFB4:
            pushq   %rbp
    LCFI3:
            movq    %rsp, %rbp
    LCFI4:
            movb    %dil, -4(%rbp)
            cmpb    $96, -4(%rbp)
            jle     L8
            cmpb    $122, -4(%rbp)
            jg      L8
            movzbl  -4(%rbp), %eax
            subl    $32, %eax
            movb    %al, -4(%rbp)
    L8:
            leave
            ret
    

    The subtraction of 32 was carried out in the %eax register. In the common x86 and x86-64 calling conventions, that is the register in which the caller expects to find the return value. So you got lucky.

    But please pay attention to the warnings. They are there for a reason!
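
One way to make sure that warning cannot be overlooked, sketched under the assumption that you build with GCC: `-Wall` enables `-Wreturn-type` for C, and `-Werror=return-type` turns it into a hard error. The file name `upper.c` and the function names below are hypothetical.

```c
/* Build sketch (GCC): gcc -Wall -Werror=return-type upper.c
   With these flags, a non-void function that can fall off its end
   is rejected instead of merely warned about. */
#include <assert.h>

/* Hypothetical name; written so that every path returns explicitly. */
char to_upper_checked(char c)
{
    if (c >= 'a' && c <= 'z')
        return (char)(c - 32);   /* assumes an ASCII character set */
    return c;                    /* without this line, GCC reports -Wreturn-type */
}

/* Usage sketch for the function above. */
void check_to_upper_checked(void)
{
    assert(to_upper_checked('b') == 'B');
    assert(to_upper_checked('B') == 'B');
    assert(to_upper_checked('?') == '?');
}
```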

  • 2020-12-02 00:45

    Essentially, c is pushed into the spot that should later be filled with the return value; since it's not overwritten by use of return, it ends up as the value returned.

    Note that relying on this in C, or in any other language where it isn't an explicit language feature (as it is in Perl), is a Bad Idea™. In the extreme.

  • 2020-12-02 00:47

    There are no local variables, so the value on the top of the stack at the end of the function will be the parameter c. The value at the top of the stack upon exiting, is the return value. So whatever c holds, that's the return value.

  • 2020-12-02 00:48

    It depends on the Application Binary Interface and which registers are used for the computation.

    E.g. on x86, the first function parameter and the return value are both stored in EAX, so gcc is most likely using that register to hold the result of the calculation as well.

  • 2020-12-02 00:48

    I can't tell you the specifics of your platform as I don't know it but there is a general answer to the behaviour you see.

    When a function that returns a value is compiled, the compiler follows a convention for how that value is passed back. It could be a machine register, or a defined memory location such as a stack slot (though generally machine registers are used). The compiled code may also use that location (register or otherwise) as scratch space while doing the work of the function.

    If the function has no return statement, the compiler will not generate code that explicitly fills that location with a return value. However, as noted above, it may still use that location during the function. When you write code that reads the return value (ch2 = toUpper(ch);), the compiler emits code that follows the convention and retrieves the value from the conventional location. The calling code simply reads whatever is in that location, even if nothing was deliberately written there. Hence you get a value.

    Now look at @Ray's example: the compiler used the EAX register to hold the result of the upper-casing operation. EAX also happens to be where return values are expected. On the calling side, ch2 is loaded with the value in EAX, hence a phantom return. This is specific to the x86 family of processors; on other architectures the compiler may follow a completely different convention.
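
To make the mechanism concrete, here is a toy model in well-defined C. It is only a sketch: `fake_eax` is an ordinary global variable standing in for the return register, and all names are invented for illustration. It does not reproduce what a real compiler does, it just mirrors the idea of a callee using the return location as scratch space and a caller reading it unconditionally.

```c
#include <assert.h>

/* Hypothetical stand-in for the return-value register (EAX on x86). */
int fake_eax = 0;

/* Models a callee with no return statement: the "register" is used as
   scratch space, and nothing is stored in it on purpose as a result. */
void to_upper_no_return(char c)
{
    if (c >= 'a' && c <= 'z') {
        fake_eax = c;           /* load the character into the "register" */
        fake_eax -= 32;         /* do the arithmetic there */
        c = (char)fake_eax;     /* store back into the local copy */
    }
    (void)c;                    /* the updated local is simply discarded */
}

/* Models the caller: it always reads the "register" after the call. */
char call_and_read(char c)
{
    to_upper_no_return(c);
    return (char)fake_eax;
}

/* Usage sketch showing both the lucky case and the stale case. */
void demo_phantom_return(void)
{
    fake_eax = 0;
    assert(call_and_read('q') == 'Q');  /* lucky: result was left behind */
    assert(call_and_read('7') == 'Q');  /* stale: "register" was not touched */
}
```

The second check is the important one: when the callee takes a path that never touches the return location, the caller just gets whatever was there before.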

    However, good compilers optimise according to local conditions, knowledge of the code, rules, and heuristics. So the important thing to note is that it is just luck that this works. The compiler could optimise differently and produce another result, so you should not rely on this behaviour.
