Basic OS X Assembly and the Mach-O format

六月ゝ 毕业季﹏ 提交于 2019-11-29 00:39:34

Firstly, why is so much DWARF debug info included in the Release build of a C executable?

Being able to debug optimized code is incredibly useful. Cases in which bugs are only visible in optimized builds are not rare. If you're hand writing assembly you're unlikely to care about DWARF information though, so I'd suggest building your comparison code without the -g argument.


Secondly, I notice the Mach-O header data is included 4 times (in different DWARF-related sections). Why is this necessary?

These aren't Mach-O headers that you're seeing. They're the headers for DWARF accelerator tables, an LLVM extension to DWARF that optimizes the test for whether a symbol is defined within a given compilation unit.


But in the 248B program, these are all nowhere to be seen - the program instead begins at _start. How is that possible if all programs by definition begin in main?

Historically on OS X all programs begin at start. However, this symbol typically comes from a system library rather than being defined by the program itself. The system implementation of start will perform some initialization and then jump to your programs "real" entry point.

The entry points to Mach-O binaries is defined by either the LC_UNIXTHREAD or LC_MAIN load commands. When LC_UNIXTHREAD, the convention for pre-10.8 versions of OS X, is used with a regular C or C++ program the linker uses start as the entry point. This symbol typically comes from /usr/lib/crt1.o, and its address is written in to the instruction pointer field of the LC_UNIXTHREAD load command. The 248B binary you link to includes an LC_UNIXTHREAD command with eip set to 0x000010e8. That's the address of the symbol _start. Since this small program is a static executable and the binary is generated directly it can write whatever address it wishes to in to the instruction pointer field of the load command.

If you're building your executable targeting OS X 10.8+ the linker will generate an LC_MAIN load command instead of LC_UNIXTHREAD. The kernel knows that binaries using the LC_MAIN command should be executed by loading the dynamic linker and jumping to its entry point. The dynamic linker, dyld, initializes itself and then jumps to the address specified in the LC_MAIN command. In this brave new world no symbol named start is used at all.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!