gcc / ld: overlapping sections (.tbss, .init_array) in statically-linked ELF binary

大城市里の小女人 提交于 2019-12-23 14:59:28

问题


I'm compiling a very simple hello-world one-liner statically on Debian 7 system on x86_64 machine with gcc version 4.8.2 (Debian 4.8.2-21):

gcc test.c -static -o test

and I get an executable ELF file that includes the following sections:

[17] .tdata            PROGBITS         00000000006b4000  000b4000
     0000000000000020  0000000000000000 WAT       0     0     8
[18] .tbss             NOBITS           00000000006b4020  000b4020
     0000000000000030  0000000000000000 WAT       0     0     8
[19] .init_array       INIT_ARRAY       00000000006b4020  000b4020
     0000000000000010  0000000000000000  WA       0     0     8
[20] .fini_array       FINI_ARRAY       00000000006b4030  000b4030
     0000000000000010  0000000000000000  WA       0     0     8
[21] .jcr              PROGBITS         00000000006b4040  000b4040
     0000000000000008  0000000000000000  WA       0     0     8
[22] .data.rel.ro      PROGBITS         00000000006b4060  000b4060
     00000000000000e4  0000000000000000  WA       0     0     32

Note that .tbss section is allocated at addresses 0x6b4020..0x6b4050 (0x30 bytes) and it intersects with allocation of .init_array section at 0x6b4020..0x6b4030 (0x10 bytes), .fini_array section at 0x6b4030..0x6b4040 (0x10 bytes) and with .jcr section at 0x6b4040..0x6b4048 (8 bytes).

Note it does not intersect with the following sections, for example, .data.rel.ro, but that's probably because .data.rel.ro alignment is 32 and thus it can't be placed any earlier than 0x6b4060.

The resulting file runs ok, but I still don't exactly get how it works. From what I read in glibc documentation, .tbss is a just .bss section for thread local storage (i.e. allocated memory scratch space, not really mapped in physical file). Is it that .tbss section is so special that it can overlap other sections? Are .init_array, .fini_array and .jcr are so useless (for example, they are not needed anymore then TLS-related code runs), so they can be overwritten by bss? Or is it some sort of a bug?

Basically, what do I get to read and write if I'll try to read address 0x6b4020 in my application? .tbss contents or .init_array pointers? Why?


回答1:


The virtual address of .tbss is meaningless as that section only serves as a template for the TLS storage as allocated by the threading implementation in GLIBC.

The way this virtual address comes into place is that .tbss follows .tbdata in the default linker script:

...
.gcc_except_table   : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) }
/* Thread Local Storage sections  */
.tdata          : { *(.tdata .tdata.* .gnu.linkonce.td.*) }
.tbss           : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) }
.preinit_array     :
{
  PROVIDE_HIDDEN (__preinit_array_start = .);
  KEEP (*(.preinit_array))
  PROVIDE_HIDDEN (__preinit_array_end = .);
}
.init_array     :
{
   PROVIDE_HIDDEN (__init_array_start = .);
   KEEP (*(SORT(.init_array.*)))
   KEEP (*(.init_array))
   PROVIDE_HIDDEN (__init_array_end = .);
}
...

therefore its virtual address is simply the virtual address of the preceding section (.tbdata) plus the size of the preceding section (eventually with some padding in order to reach the desired alignment). .init_array (or .preinit_array if present) comes next and its location should be determined the same way, but .tbss is known to be so very special, that it is given a deeply hard-coded treatment inside GNU LD:

/* .tbss sections effectively have zero size.  */
if ((os->bfd_section->flags & SEC_HAS_CONTENTS) != 0
    || (os->bfd_section->flags & SEC_THREAD_LOCAL) == 0
    || link_info.relocatable)
  dotdelta = TO_ADDR (os->bfd_section->size);
else
  dotdelta = 0;    // <----------------
dot += dotdelta;

.tbss is not relocatable, it has the SEC_THREAD_LOCAL flag set, and it does not have contents (NOBITS), therefore the else branch is taken. In other words, no matter how large the .tbss is, the linker does not advance the location of the section that follows it (also know as "the dot").

Note also that .tbss sits in a non-loadable ELF segment:

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x0000000000000000 0x0000000000400000 0x0000000000400000
                 0x00000000000b1f24 0x00000000000b1f24  R E    200000
  LOAD           0x00000000000b2000 0x00000000006b2000 0x00000000006b2000
                 0x0000000000002288 0x00000000000174d8  RW     200000
  NOTE           0x0000000000000158 0x0000000000400158 0x0000000000400158
                 0x0000000000000044 0x0000000000000044  R      4
  TLS            0x00000000000b2000 0x00000000006b2000 0x00000000006b2000 <---+
                 0x0000000000000020 0x0000000000000060  R      8              |
  GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000     |
                 0x0000000000000000 0x0000000000000000  RW     8              |
                                                                              |
 Section to Segment mapping:                                                  |
  Segment Sections...                                                         |
   00     .note.ABI-tag ...                                                   |
   01     .tdata .ctors ...                                                   |
   02     .note.ABI-tag ...                                                   |
   03     .tdata .tbss    <---------------------------------------------------+
   04



回答2:


This is rather simple if you have an understanding about two things:

1) What is SHT_NOBITS

2) What is tbss section

SHT_NOBITS means that this section occupies no space inside file.

Normally, NOBITS sections, like bss are placed after all PROGBITS sections at the end of the loaded segments.

tbss is special section to hold uninitialized thread-local data that contribute to the program's memory image. Take an attention here: this section must hold unique data for each program thread.

Now lets talk about overlapping. We have two possible overlappings -- inside binary file and inside memory.

1) Binary files offset:

There is no data to write under this section in binary. Inside file it holds no space, so linker start next section init_array immediately after tbss declared. You may think about its size not as about size, but as about special service information for code like:

if (isTLSSegment) tlsStartAddr += section->memSize();

So it doesn't overlap anything inside file.

2) Memory offset

The tdata and tbss sections may be possibly modified at startup time by the dynamic linker performing relocations, but after that the section data is kept around as the initialization image and not modified anymore. For each thread, including the initial one, new memory is allocated into which then the content of the initialization image is copied. This ensures that all threads get the same starting conditions.

This what makes tbss (and tdata) so special.

Do not think about their memory offsets as about statically known -- they are more like "generation patterns" for per-thread work. So they also can not overlap with "normal" memory offsets -- they are being processed in other way.

You may consult with this paper to know more.



来源:https://stackoverflow.com/questions/25501044/gcc-ld-overlapping-sections-tbss-init-array-in-statically-linked-elf-bin

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!