How can gcc/clang assume a string constant's address is 32-bit?

▼魔方 西西 提交于 2019-11-29 10:57:20

In GCC manual:

https://gcc.gnu.org/onlinedocs/gcc-4.5.3/gcc/i386-and-x86_002d64-Options.html

3.17.15 Intel 386 and AMD x86-64 Options

-mcmodel=small

Generate code for the small code model: the program and its symbols must be linked in the lower 2 GB of the address space. Pointers are 64 bits. Programs can be statically or dynamically linked. This is the default code model.

-mcmodel=kernel Generate code for the kernel code model. The kernel runs in the negative 2 GB of the address space. This model has to be used for Linux kernel code.

-mcmodel=medium

Generate code for the medium model: The program is linked in the lower 2 GB of the address space. Small symbols are also placed there. Symbols with sizes larger than -mlarge-data-threshold are put into large data or bss sections and can be located above 2GB. Programs can be statically or dynamically linked.

-mcmodel=large

Generate code for the large model: This model makes no assumptions about addresses and sizes of sections.


https://gcc.gnu.org/onlinedocs/gcc/AArch64-Options.html

3.18.1 AArch64 Options

-mcmodel=tiny

Generate code for the tiny code model. The program and its statically defined symbols must be within 1GB of each other. Pointers are 64 bits. Programs can be statically or dynamically linked. This model is not fully implemented and mostly treated as ‘small’.

-mcmodel=small

Generate code for the small code model. The program and its statically defined symbols must be within 4GB of each other. Pointers are 64 bits. Programs can be statically or dynamically linked. This is the default code model.

-mcmodel=large

Generate code for the large code model. This makes no assumptions about addresses and sizes of sections. Pointers are 64 bits. Programs can be statically linked only.

Antti Haapala

I can confirm that this happens on 64-bit compilation:

gcc -O1 foo.c

Then objdump -d a.out (notice also that printf("%s\n") can be optimized into puts!):

0000000000400536 <main>:
  400536:       48 83 ec 08             sub    $0x8,%rsp
  40053a:       bf d4 05 40 00          mov    $0x4005d4,%edi
  40053f:       e8 cc fe ff ff          callq  400410 <puts@plt>
  400544:       b8 00 00 00 00          mov    $0x0,%eax
  400549:       48 83 c4 08             add    $0x8,%rsp
  40054d:       c3                      retq   
  40054e:       66 90                   xchg   %ax,%ax

The reason is that GCC defaults to -mcmodel=small where the static data is linked in the bottom 2G of address space.


Notice that string constants do not go to the data segment, but they're within the code segment instead, unless -fwritable-strings. Also if you want to relocate the object code freely in memory, you'd probably want to compile with -fpic to make the code RIP relative instead of putting 64-bit addresses everywhere.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!