Why is no value returned if a function does not explicitly use 'ret'?

离开以前 2020-12-07 04:00

I have the following program:

SECTION .text
main:
     mov ebx, 10
     mov ecx, 50

repeat:
     inc ebx
     loop repeat

     mov eax, ebx
     ret
         


        
2 Answers
  •  甜味超标
    2020-12-07 04:40

    Because it falls through and runs the next function the linker put after it.

    See my comments on Ira's answer for why your code didn't just crash. If you weren't linking with the C runtime library startup code (i.e. you just had _start instead of main), execution would hit some non-code, and either fault on illegal instruction, or try to access unmapped memory. See below.

    Disassemble your final binary to see what happened. When I tried this, I found that the linker put main between the standard C runtime startup functions frame_dummy and __libc_csu_init:

    00000000004004f6 <main>:
      4004f6:  b8 0a 00 00 00     mov    $0xa,%eax
      4004fb:  0f 1f 44 00 00     nopl   0x0(%rax,%rax,1)

    0000000000400500 <__libc_csu_init>:
      400500:  41 57              push   %r15
      400502:  41 56              push   %r14
      400504:  41 89 ff           mov    %edi,%r15d
      400507:  41 55              push   %r13
      ... a bunch more code that eventually returns.

    You could have found out what happens with a debugger, single-stepping instructions.


    BTW, if you did make a freestanding binary, either with gcc -static -nostartfiles or by assembling (as foo.s) / linking (ld foo.o) yourself, you'd get an 888 byte file holding your one instruction, with the rest being ELF headers and stuff.

    $ cat > fallthrough.s <<...
    ...
       0x00000000004000d4:  b8 0a 00 00 00  mov    $0xa,%eax
       0x00000000004000d9:  00 00   add    %al,(%rax)
       0x00000000004000db:  00 00   add    %al,(%rax)
       0x00000000004000dd:  00 00   add    %al,(%rax)
       0x00000000004000df:  00 2c 00        add    %ch,(%rax,%rax,1)
       0x00000000004000e2:  00 00   add    %al,(%rax)
       0x00000000004000e4:  02 00   add    (%rax),%al
       0x00000000004000e6:  00 00   add    %al,(%rax)
       0x00000000004000e8:  00 00   add    %al,(%rax)
       0x00000000004000ea:  08 00   or     %al,(%rax)
       0x00000000004000ec:  00 00   add    %al,(%rax)
       0x00000000004000ee:  00 00   add    %al,(%rax)
       0x00000000004000f0:  d4      (bad)  
       0x00000000004000f1:  00 40 00        add    %al,0x0(%rax)
       ...
    (gdb) layout asm  #text-window mode. layout reg is great for single-stepping, BTW.
    (gdb) si   # step instruction
    0x00000000004000d9 in ?? ()
    (gdb) si
    Program received signal SIGSEGV, Segmentation fault.
    0x00000000004000d9 in ?? ()
    (gdb) c
    Continuing.
    
    Program terminated with signal SIGSEGV, Segmentation fault.
    The program no longer exists.
    

    The 00 bytes that follow your code in memory are there in the ELF executable, too. Memory-mapping of files only happens with page granularity, so everything in the same page ends up mapped as executable instructions. (Machine code isn't copied out of the disk cache for executables; the memory is just mapped with read+execute permission into the process that execve(2)s the binary.)
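
    To put a number on that, here is a small C sketch of the page-rounding arithmetic. The address 0x4000d4 and the 5-byte length of the mov come from the gdb session above; the 4 KiB page size and the helper names are assumptions made for illustration, not anything the loader exposes.

    ```c
    #include <stdint.h>

    /* Assumption: 4 KiB pages, the usual base page size on x86-64 Linux. */
    #define ASSUMED_PAGE_SIZE 4096u

    /* Round an address down to the start of the page that contains it. */
    uint64_t page_floor(uint64_t addr)
    {
        return addr & ~(uint64_t)(ASSUMED_PAGE_SIZE - 1);
    }

    /* Round an address up to the next page boundary (unchanged if aligned). */
    uint64_t page_ceil(uint64_t addr)
    {
        return page_floor(addr + ASSUMED_PAGE_SIZE - 1);
    }

    /* Number of bytes between the end of the code and the end of its page.
     * These bytes share the page's read+execute mapping with the code. */
    uint64_t mapped_bytes_past_code(uint64_t code_start, uint64_t code_len)
    {
        uint64_t code_end = code_start + code_len; /* first byte after the code */
        return page_ceil(code_end) - code_end;
    }
    ```

    Under the 4 KiB assumption, mapped_bytes_past_code(0x4000d4, 5) is 0x401000 - 0x4000d9 = 3879, so there are plenty of mapped, executable bytes for execution to run into after the mov.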

    $ objdump -s a.out
    a.out:     file format elf64-x86-64
    
    Contents of section .note.gnu.build-id:
     4000b0 04000000 14000000 03000000 474e5500  ............GNU.
     4000c0 db31c97d 55481b9a 57110753 1786dd1a  .1.}UH..W..S....
     4000d0 11679958                             .g.X
    Contents of section .text:
     4000d4 b80a0000 00                          .....
    Contents of section .debug_aranges:
     0000 2c000000 02000000 00000800 00000000  ,...............
     0010 d4004000 00000000 05000000 00000000  ..@.............
     0020 00000000 00000000 00000000 00000000  ................
     ...
    
    $ size a.out
       text    data     bss     dec     hex filename
         41       0       0      41      29 a.out
    

    Stripping the binary still makes it segfault, but with a different instruction. Yay?

    # b _start would be b *0x4000d4 without symbols.
    (gdb) r
     ...
    Program received signal SIGSEGV, Segmentation fault.
    0x00000000004000d9 in ?? ()
    (gdb) disassemble /r $rip-5, $rip +15
    Dump of assembler code from 0x4000d4 to 0x4000e8:
       0x00000000004000d4:  b8 0a 00 00 00  mov    $0xa,%eax
    => 0x00000000004000d9:  00 2e   add    %ch,(%rsi)
       0x00000000004000db:  73 68   jae    0x400145
       0x00000000004000dd:  73 74   jae    0x400153
       0x00000000004000df:  72 74   jb     0x400155
       0x00000000004000e1:  61      (bad)  
       0x00000000004000e2:  62      (bad)  
       0x00000000004000e3:  00 2e   add    %ch,(%rsi)
       0x00000000004000e5:  6e      outsb  %ds:(%rsi),(%dx)
       0x00000000004000e6:  6f      outsl  %ds:(%rsi),(%dx)
       0x00000000004000e7:  74 65   je     0x40014e
    
    $ hexdump -C a.out
     ...
    000000d0  11 67 99 58 b8 0a 00 00  00 00 2e 73 68 73 74 72  |.g.X.......shstr|
    000000e0  74 61 62 00 2e 6e 6f 74  65 2e 67 6e 75 2e 62 75  |tab..note.gnu.bu|
    000000f0  69 6c 64 2d 69 64 00 2e  74 65 78 74 00 00 00 00  |ild-id..text....|
    00000100  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
    

    Our mov instruction is the b8 0a 00 00 00 in the first line I included from the hexdump. I think the following 00 2e ... is an ELF data structure, probably an index of sections or something. As an x86 instruction, it's an add %ch,(%rsi), which segfaults because %rsi isn't pointing to writeable memory. (The ABI says registers other than the stack pointer are undefined on process entry, but Linux chooses to zero them in the ELF loader to avoid leaking kernel data. %rsi doesn't point to writeable memory, and the process probably doesn't have any.)
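
    The decoding of 00 2e can be checked by hand: opcode 00 is ADD r/m8, r8, and the byte after it is a ModRM byte. A minimal C sketch that splits a ModRM byte into its fields follows. The struct and function names are made up for illustration, and the lookup tables only cover the case that matters here (no REX prefix, mod == 0, 64-bit mode).

    ```c
    #include <stddef.h>
    #include <stdint.h>

    /* Fields of an x86 ModRM byte: mod = bits 7-6, reg = bits 5-3, rm = bits 2-0. */
    struct modrm {
        unsigned mod;
        unsigned reg;
        unsigned rm;
    };

    struct modrm decode_modrm(uint8_t byte)
    {
        struct modrm m;
        m.mod = (byte >> 6) & 0x3;
        m.reg = (byte >> 3) & 0x7;
        m.rm  = byte & 0x7;
        return m;
    }

    /* 8-bit register selected by the reg field when there is no REX prefix. */
    const char *reg8_name(unsigned reg)
    {
        static const char *const names[8] = {
            "%al", "%cl", "%dl", "%bl", "%ah", "%ch", "%dh", "%bh"
        };
        return names[reg & 0x7];
    }

    /* Memory operand for mod == 0 in 64-bit mode. rm == 4 means a SIB byte
     * follows and rm == 5 means RIP-relative disp32; both return NULL here. */
    const char *mem_operand_mod0(unsigned rm)
    {
        static const char *const names[8] = {
            "(%rax)", "(%rcx)", "(%rdx)", "(%rbx)", NULL, NULL, "(%rsi)", "(%rdi)"
        };
        return names[rm & 0x7];
    }
    ```

    For 0x2e this gives mod 0, reg 5 and rm 6, i.e. %ch and (%rsi), matching the add %ch,(%rsi) that gdb printed. For 0x00 it gives %al and (%rax), which is the add %al,(%rax) seen in the unstripped binary.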


    So what if you added a return here? Nope, there's nothing to return to. The stack contains pointers to the process args and environment variables. You have to make an exit system call.

    .section .text
    .globl _start
    _start:
            xor %edi, %edi
            mov $231, %eax  #  __NR_exit_group: exit_group(0)
            syscall
    
    #       movl $1, %eax    # The 32bit ABI works even for processes in long mode, BTW.
    #       int $0x80        # exit(ebx)
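
    As for why a ret in _start has nothing to return to: on x86-64 Linux the word at the top of the stack on process entry is argc, followed by the argv pointers, a NULL, the envp pointers, and another NULL. The C sketch below walks an array laid out that way. The struct and function names are hypothetical, and it is meant to be fed a fake array, not a real stack pointer.

    ```c
    #include <stdint.h>

    /* View of the words the kernel leaves on the stack at _start. */
    struct initial_stack {
        uint64_t argc;
        const uint64_t *argv; /* argc pointer-sized slots, then a 0 terminator */
        const uint64_t *envp; /* environment pointers, then a 0 terminator */
    };

    struct initial_stack parse_initial_stack(const uint64_t *sp)
    {
        struct initial_stack s;
        s.argc = sp[0];
        s.argv = sp + 1;
        s.envp = sp + 1 + s.argc + 1; /* skip argc, argv[0..argc-1], and the NULL */
        return s;
    }

    /* What ret would pop and jump to if executed first thing in _start:
     * the word at the top of the stack, which is argc, not a code address. */
    uint64_t ret_target(const uint64_t *sp)
    {
        return sp[0];
    }
    ```

    With a fake stack of { 2, argv0, argv1, 0, env0, 0 }, ret_target() returns 2, so a ret would try to jump to address 2 and fault. That is why the exit has to be a system call.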
    
