About returning more than one value in C/C++/Assembly

囚心锁ツ 2020-12-18 09:18

I have read some questions about returning more than one value such as What is the reason behind having only one return value in C++ and Java?, Returning multiple values fro

3 answers
  •  暗喜 (original poster)
    2020-12-18 09:49

    Returning on the stack isn't necessarily slower: once the values are in L1 cache (which is usually the case for the top of the stack), accessing them is very fast.

    However, most computer architectures provide at least two registers for return values, so a value twice as wide as the word size (or more) can be returned in registers: edx:eax on x86, rdx:rax on x86_64, $v0 and $v1 on MIPS (Why MIPS assembler has more that one register for return value?), R0-R3 on ARM [1], X0-X7 on ARM64, and so on. The architectures that don't are mostly microcontrollers with a single accumulator or a very limited number of registers.

    [1] "If the type of value returned is too large to fit in r0 to r3, or whose size cannot be determined statically at compile time, then the caller must allocate space for that value at run time, and pass a pointer to that space in r0."
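
The memory-return case described in the footnote can be pictured by hand. The sketch below is only an illustration: `Big`, `make_big`, `make_big_into` and `check_same` are hypothetical names, and the real hidden pointer is inserted by the compiler according to the ABI, so it never appears in source code.

```cpp
#include <cassert>

// Hypothetical struct: 16 ints (64 bytes with a 4-byte int),
// too large for r0 to r3 on 32-bit ARM.
struct Big {
    int v[16];
};

// What the programmer writes: return by value.
Big make_big() {
    Big b;
    for (int i = 0; i < 16; ++i) b.v[i] = i * i;
    return b;
}

// Hand-written approximation of the lowered call: the caller owns the
// storage and passes its address as an extra argument.
void make_big_into(Big* out) {
    for (int i = 0; i < 16; ++i) out->v[i] = i * i;
}

// Both routes fill the caller's object with the same contents.
void check_same() {
    Big a = make_big();
    Big b;               // caller-allocated space
    make_big_into(&b);   // pointer to that space
    for (int i = 0; i < 16; ++i) assert(a.v[i] == b.v[i]);
}
```

Calling `check_same()` exercises both versions. From the caller's point of view the only difference is who spells out the pointer.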

    The return registers can also be used to return small structs directly, as long as the struct fits in two registers (or more, depending on the architecture and ABI).

    For example, take the following code:

    struct Point
    {
        int x, y;
    };
    
    struct shortPoint
    {
        short x, y;
    };
    
    struct Point3D
    {
        int x, y, z;
    };
    
    Point P1()
    {
        Point p;
        p.x = 1;
        p.y = 2;
        return p;
    }
    
    Point P2()
    {
        Point p;
        p.x = 1;
        p.y = 0;
        return p;
    }
    
    shortPoint P3()
    {
        shortPoint p;
        p.x = 1;
        p.y = 0;
        return p;
    }
    
    Point3D P4()
    {
        Point3D p;
        p.x = 1;
        p.y = 2;
        p.z = 3;
        return p;
    }
    

    For x86_64, Clang emits the following instructions:

    P1():                                 # @P1()
        movabs  rax, 8589934593
        ret
    
    P2():                                 # @P2()
        mov eax, 1
        ret
    
    P3():                                 # @P3()
        mov eax, 1
        ret
    
    P4():                                 # @P4()
        movabs  rax, 8589934593
        mov edx, 3
        ret
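
The immediate 8589934593 in P1 and P4 looks arbitrary, but it is just the two `int` members packed into one 64-bit register. The arithmetic can be checked at compile time. This is a sketch assuming a little-endian layout with 32-bit `int`, and `pack` is a hypothetical helper, not something the compiler emits.

```cpp
#include <cstdint>

// Combine two 32-bit halves the way a little-endian 64-bit register
// holds struct { int x, y; }: x in the low half, y in the high half.
constexpr std::uint64_t pack(std::uint32_t x, std::uint32_t y) {
    return (static_cast<std::uint64_t>(y) << 32) | x;
}

// Point{1, 2} from P1, and the x/y part of Point3D{1, 2, 3} from P4.
static_assert(pack(1, 2) == 8589934593ULL, "matches the movabs immediate");

// Point{1, 0} from P2: the high half is zero, so a 32-bit mov suffices
// (writing eax zero-extends into rax on x86_64).
static_assert(pack(1, 0) == 1ULL, "matches mov eax, 1");
```

In P4 the third member, `z = 3`, does not fit in the same register, which is why it goes into `edx` separately.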
    

    For ARM64:

    P1():
        mov x0, 1
        orr x0, x0, 8589934592
        ret
    P2():
        mov x0, 1
        ret
    P3():
        mov w0, 1
        ret
    P4():
        mov x1, 1
        mov x0, 0
        sub sp, sp, #16
        bfi x0, x1, 0, 32
        mov x1, 2
        bfi x0, x1, 32, 32
        add sp, sp, 16
        mov x1, 3
        ret
    

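    As you can see, no stack memory is read or written: the values come back entirely in registers. (The sub sp / add sp pair in the ARM64 P4 only adjusts the stack pointer; nothing is stored to or loaded from the stack.) You can switch to other compilers to see that the values are mostly returned in registers.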
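
A rough way to tell ahead of time whether a struct is a candidate for register return is to compare its size with the two-register limit. The sketch below assumes the System V x86_64 convention used in the Clang output above, where a trivially copyable struct of integers up to 16 bytes can come back in rax and rdx. The constant `kTwoRegs`, the helper `may_fit_in_two_regs` and the struct `Big` are assumptions for illustration. Other ABIs use different rules (the Windows x64 convention, for instance, only returns structs of 1, 2, 4 or 8 bytes in a register), so treat this as a heuristic and not a guarantee.

```cpp
#include <cstddef>
#include <type_traits>

struct Point      { int x, y; };
struct shortPoint { short x, y; };
struct Point3D    { int x, y, z; };
struct Big        { int v[16]; };  // hypothetical oversized struct

// Assumed limit: two 8-byte general-purpose registers (rax and rdx).
constexpr std::size_t kTwoRegs = 16;

// Size-based heuristic only; the real ABI classification also looks
// at member types (integer vs. floating point) and alignment.
template <typename T>
constexpr bool may_fit_in_two_regs() {
    return std::is_trivially_copyable<T>::value && sizeof(T) <= kTwoRegs;
}

static_assert(may_fit_in_two_regs<Point>(),      "P1/P2 use rax only");
static_assert(may_fit_in_two_regs<shortPoint>(), "P3 uses eax only");
static_assert(may_fit_in_two_regs<Point3D>(),    "P4 uses rax and edx");
static_assert(!may_fit_in_two_regs<Big>(),       "returned through memory");
```

If every `static_assert` passes, the file compiles cleanly. Checking the compiler's actual output remains the only reliable confirmation.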
