Does the C startup code change addresses of data

Question

In embedded development using C, I understand that when a program is compiled and linked, a binary or ELF file is produced which executes on the target. The ELF file will contain (along with a lot of other stuff) the addresses or address offsets of global variables.

Now, when the C startup code executes first, it can copy non-const data / variables from flash memory into RAM if this data is to be modified throughout the program.

This will then change the memory addresses of the variables on the system? Does this then update the ELF file to change the address of this data?

Answer 1:

Note: the following describes the behaviour of a standalone (bare-metal) embedded environment and is not necessarily true of all environments. Hosted systems that load code dynamically from mass storage, for example, differ significantly, but not so much that this stops being a valid answer in that case too: no addresses are "changed".

No addresses are "changed". In a "typical" standalone environment with code running from ROM (not all embedded systems are organised that way, but your question suggests it), the linker locates all symbols at specific addresses. If a variable is located in RAM and has an initial value, it is the initialisation data, not the variable, that is stored in ROM and copied to RAM at start-up. The variable's location has not changed: it was always in RAM, and the initialisation data was always in ROM.

You can see this in the link map that your linker is almost certainly capable of producing.

Address data in the object file, be it ELF, Intel Hex or whatever, only has a bearing when the code is loaded. All objects are located by the linker and placed at the required location by the loader (be that a flash programmer, debugger or bootloader) using the address information in the file. A raw binary file, by contrast, has no address information at all: you must specify the load address when you load it, and any "gaps" must be filled with padding. If the code is not position independent (position independent meaning that all address references are relative), then a raw binary must be loaded to the correct specific address or it will not run as intended.
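As a rough illustration of why a raw binary needs a load address and padding, here is a minimal sketch of flattening addressed segments into one contiguous image. The `segment_t` type and the `flatten_to_raw` function are hypothetical names made up for this example, not part of any toolchain, and the addresses used with it are arbitrary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One addressed chunk, as an ELF or Intel Hex file would describe it.
 * Hypothetical type for illustration only. */
typedef struct {
    uint32_t       address;  /* where the bytes must end up on the target */
    const uint8_t *bytes;
    size_t         length;
} segment_t;

/* Flatten addressed segments into a raw image that starts at base_address.
 * Gaps between segments are filled with pad_byte, because a raw image has
 * no way to say "skip to address X". Returns the number of image bytes
 * used, or 0 if a segment lies below the base or does not fit. */
static size_t flatten_to_raw(const segment_t *segments, size_t count,
                             uint32_t base_address, uint8_t pad_byte,
                             uint8_t *image, size_t image_size)
{
    size_t used = 0;

    memset(image, pad_byte, image_size);
    for (size_t i = 0; i < count; i++) {
        if (segments[i].address < base_address)
            return 0;
        size_t offset = segments[i].address - base_address;
        if (offset > image_size || segments[i].length > image_size - offset)
            return 0;
        memcpy(image + offset, segments[i].bytes, segments[i].length);
        if (offset + segments[i].length > used)
            used = offset + segments[i].length;
    }
    return used;
}
```

The resulting image carries no record of `base_address`; whoever loads it has to supply that value again, which is the point made above.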

When a debugger loads an ELF file, it loads the binary to the target and reads the address and symbolic information into the debugger running in the host. The address and symbol information is implicit in the code and not explicitly present in the target (unless you have target based debug facilities that make use of such information perhaps).

The C runtime start-up performs the following steps (at least):

Stack pointer initialisation
Zero initialisation of static data without an explicit non-zero initialisation (BSS segment)
Initialisation of explicitly initialised non-zero static data (Data segment)
Runtime library initialisation (such as heap initialisation)
Jump to main.

For systems running code from RAM, there would also be a step where that code is copied from ROM to RAM (in some cases including image decompression), and if the runtime supports C++, there will be a step that calls the constructors of all static objects. Often this run-time start-up code is provided by your toolchain, and you may never take any notice of it, but it will almost certainly be available to you as source code (in assembler and/or C) for you to customise or just see how it works.

The data and BSS segments are each normally a single contiguous block, so the zero initialisation is simply a block memory write of zeros, and the data initialisation is simply a memory copy (possibly with decompression) of a block from ROM to RAM.
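The zero fill and the copy can be sketched in a few lines of C. This is a host-runnable model, not real start-up code: on a target the region bounds would come from symbols defined by the linker script, whereas here plain arrays with made-up names (`rom_init_image`, `ram_data`, `ram_bss`) stand in for them, and decompression is not modelled.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-ins for linker-defined regions (hypothetical names and sizes). */
static const uint8_t rom_init_image[4] = { 0x11, 0x22, 0x33, 0x44 }; /* initial values kept in ROM */
static uint8_t ram_data[4]; /* run-time home of initialised variables */
static uint8_t ram_bss[8];  /* run-time home of zero-initialised variables */

/* The two block operations a minimal start-up routine performs before
 * jumping to main(). Neither one moves a variable: both only write values
 * into locations the linker already assigned. */
static void crt_init_static_data(void)
{
    memset(ram_bss, 0, sizeof ram_bss);                /* BSS: block write of zeros */
    memcpy(ram_data, rom_init_image, sizeof ram_data); /* data: block copy ROM to RAM */
}
```

Note that `ram_data` and `rom_init_image` have different addresses from the outset, and both stay where they are after the copy.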

The code segment, whether in ROM or RAM, is called the text segment.

So you have two memory maps - the image memory map (which is ROMable) containing the text, constant data, initialisation data and bootstrap or start up code, and a run-time memory map containing the text, constant data, data, BSS, stack and heap segments. The data, BSS, stack and heap are necessarily in RAM, the text and constant data may be in ROM or RAM depending on the runtime architecture of the system.
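The two maps can be written down as a small lookup table. This is only a sketch under one assumption, namely a system that executes code directly from ROM; the type and function names (`map_entry_t`, `find_segment`) are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { IN_ROM, IN_RAM, NOT_STORED } location_t;

typedef struct {
    const char *name;
    location_t  image_location;   /* where it sits in the ROMable image */
    location_t  runtime_location; /* where the running program sees it */
} map_entry_t;

/* Assumption: code runs in place from ROM. For a system that runs code
 * from RAM, the runtime location of text and constant data would differ. */
static const map_entry_t memory_maps[] = {
    { "text",       IN_ROM,     IN_ROM },
    { "const data", IN_ROM,     IN_ROM },
    { "data",       IN_ROM,     IN_RAM }, /* initial values in ROM, variables in RAM */
    { "bss",        NOT_STORED, IN_RAM }, /* zero-filled at start-up, nothing to store */
    { "stack",      NOT_STORED, IN_RAM },
    { "heap",       NOT_STORED, IN_RAM },
};

/* Look up a segment by name; returns NULL if it is not in the table. */
static const map_entry_t *find_segment(const char *name)
{
    for (size_t i = 0; i < sizeof memory_maps / sizeof memory_maps[0]; i++) {
        if (strcmp(memory_maps[i].name, name) == 0)
            return &memory_maps[i];
    }
    return NULL;
}
```

Only the "data" row has different entries in the two columns, and even there the two columns describe different things: the stored initial values and the variables themselves.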

Ref: https://en.wikipedia.org/wiki/Data_segment

Answer 2:

"Now, when the C startup code executes first, it can copy non-const data / variables from flash memory into RAM if this data is to be modified throughout the program.

This will then change the memory addresses of the variables on the system? Does this then update the ELF file to change the address of this data?"

All references to your variables throughout the program code will use their RAM addresses, in sections .bss and .data. The ROM constants used for initialisation (stored in or alongside .rodata, depending on the toolchain) will have some other address that only the start-up code ("CRT") is concerned with. This works the same whether you look at the ELF or the final binary.
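For a concrete picture, here are three file-scope variables of the kinds discussed, with the section each one would normally land in. The section placement in the comments is the usual convention rather than a guarantee, since it depends on the toolchain and linker script, and the variable names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

int       boot_count = 42;   /* initialised: normally .data (RAM), with 42 stored in ROM */
int       error_total;       /* no initialiser: normally .bss (RAM), starts at 0 */
const int retry_limit = 100; /* constant: normally .rodata, which can stay in ROM */

/* Writes to the RAM-resident variables and reports whether their addresses
 * are the same afterwards. They always are: the program only ever uses the
 * link-time addresses. */
static int addresses_stable_after_write(void)
{
    const uintptr_t data_addr = (uintptr_t)&boot_count;
    const uintptr_t bss_addr  = (uintptr_t)&error_total;

    boot_count += 1;
    error_total += retry_limit;

    return data_addr == (uintptr_t)&boot_count &&
           bss_addr == (uintptr_t)&error_total;
}
```

On a hosted build you can print `&boot_count` and compare it with the symbol address in the link map; the two should agree, apart from any load-time relocation a hosted OS applies.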

See this post on the EE site for further info: What resides in the different memory types of a microcontroller?

Source: https://stackoverflow.com/questions/58241883/does-the-c-startup-code-change-addresses-of-data

Tags

linker

embedded

startup

elf