Why does x86 program segfault without .data section? [duplicate]

Question

I'm writing a basic assembly subtraction routine and printing the result to the console. Here's the code I think SHOULD work (assembled with as output.s, then linked with ld a.out -e _start -o output):

    .bss
output:
    .int

    .text
    .global _start

_start: 

movl $9, %eax
movl %eax, %ebx
movl $8, %eax
subl %eax, %ebx

movl %ebx, (output)

# ASCII for digits is 0x30 greater than digit value
addl    $0x30, output

movl    $2, %edx        # write 2 bytes (need 1 for null?)
movl    $output, %ecx   # output
movl    $1, %ebx        # write to stdout (fd 1)
movl    $4, %eax        # syscall number for write
int $0x80               # invoke syscall

# CR
movl    $2, %edx
movl    $13, (output)
movl    $output, %ecx
movl    $1, %ebx
movl    $4, %eax
int $0x80

# LF
movl    $2, %edx
movl    $10, (output)
movl    $output, %ecx
movl    $1, %ebx
movl    $4, %eax
int $0x80

# exit
movl    $0, %ebx
movl    $1, %eax
int $0x80

However, this program segfaults. I found that if I add a trivial .data section at the end:

    .data   
pingle:
    .int 666

it works fine. Why do I need the .data segment? Am I overflowing one of the segments when I write 2 bytes each time? Or is overwriting output several times doing this?

Any ideas are much appreciated!

Answer 1:

.int with an empty list reserves no space, so your program doesn't actually have a BSS.

Use .space 4 in the .bss section to reserve 4 bytes, or use .comm output, 4 to reserve 4 bytes in the BSS without needing a .bss directive first. .int 0 should work too, but directives that only reserve space are more idiomatic.
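For illustration, here is a minimal sketch of the question's program with the reservation fixed. It assumes a 32-bit build (as --32 fixed.s -o fixed.o, then ld -m elf_i386 fixed.o -o fixed); the file name fixed.s is made up for this example. It also writes 1 byte per call instead of 2 and drops the separate CR write, which are my simplifications, not part of the fix itself.

```asm
    .bss
output:
    .space 4                # reserve 4 zero-initialized bytes (this is the fix)

    .text
    .globl _start
_start:
    movl    $9, %ebx
    subl    $8, %ebx        # ebx = 9 - 8 = 1
    addl    $0x30, %ebx     # single digit -> ASCII ('1')
    movl    %ebx, output    # store 4 bytes: '1', 0, 0, 0

    movl    $1, %edx        # length: exactly 1 byte, no terminator
    movl    $output, %ecx   # buffer address
    movl    $1, %ebx        # fd 1 = stdout
    movl    $4, %eax        # write in the 32-bit int $0x80 ABI
    int     $0x80

    movl    $10, output     # overwrite the buffer with LF
    movl    $1, %edx
    movl    $output, %ecx
    movl    $1, %ebx
    movl    $4, %eax
    int     $0x80

    movl    $0, %ebx        # exit status 0
    movl    $1, %eax        # exit in the 32-bit int $0x80 ABI
    int     $0x80
```

With the 4 bytes actually reserved, nm should show output at a lower address than _end, and the stores to output land inside a mapped, writable page without needing a dummy .data section.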

See also the gas manual and the x86 tag wiki.

IIRC, the BSS can end up in the same page as the data segment, and memory-access checking has page granularity. This explains why loading from and storing to (output) happens to work once you add a .data section, even though the address is past the end of the BSS.

An example

 ## nobss.S
.bss
.globl output       # put this symbol 
output: .int

.text
.globl _start
_start:
    mov (output), %eax

$ gcc -g -nostdlib nobss.S
$ nm -n  ./a.out            # nm -a -n  to also include debug syms, but gas doesn't make debug info automatically (unlike NASM/YASM)
00000000004000d4 T _start
00000000006000db T __bss_start
00000000006000db T _edata
00000000006000db T output
00000000006000db T end_of_bss       # same address as output, proving that .int reserved no space.
00000000006000e0 T _end

$ gdb ./a.out
(gdb) b _start
(gdb) r
 # a.out is now running, but stopped before the first instruction

# Then, in another terminal:
$ less /proc/$(pidof a.out)/maps 
00400000-00401000 r-xp 00000000 09:7e 9527300                            /home/peter/src/SO/a.out
7ffff7ffb000-7ffff7ffd000 r--p 00000000 00:00 0                          [vvar]
7ffff7ffd000-7ffff7fff000 r-xp 00000000 00:00 0                          [vdso]
7ffffffdd000-7ffffffff000 rwxp 00000000 00:00 0                          [stack]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]

Note the absence of any anonymous mappings that could be the BSS, or any writable mapping of a.out (data). Only our program text is mapped. (With a private mapping, but it's still actually copy-on-write shared.) See this answer for what the fields mean.

Terminating 0 bytes in read and write aren't needed

movl    $2, %edx        # write 2 bytes (need 1 for null?)

read and write system calls take explicit lengths. You don't need to (and shouldn't) include a terminating zero byte in the length you pass to write(). For example,

# You want this
$ strace echo foo > /dev/null
...
write(1, "foo\n", 4)                    = 4
...

# not this:
$ strace printf 'foo\n\0' > /dev/null

...
write(1, "foo\n\0", 5)                  = 5
...

Source: https://stackoverflow.com/questions/39356236/why-does-x86-program-segfault-without-data-section

Tags

assembly

x86

segmentation-fault