问题
I tried to compile and convert a very simple C program to assembly language.
I am using Ubuntu and the OS type is 64 bit.
This is the C Program.
void add();
int main() {
add();
return 0;
}
if i use gcc -S -m32 -fno-asynchronous-unwind-tables -o simple.S simple.c this is how my assembly source code File should look like:
.file "main1.c"
.text
.globl main
.type main, @function
main:
pushl %ebp
movl %esp, %ebp
andl $-16, %esp
call add
movl $0, %eax
movl %ebp, %esp
popl %ebp
ret
.size main, .-main
.ident "GCC: (Debian 4.4.5-8) 4.4.5" // this part should say Ubuntu instead of Debian
.section .note.GNU-stack,"",@progbits
but instead it looks like this:
.file "main0.c"
.text
.globl main
.type main, @function
main:
leal 4(%esp), %ecx
andl $-16, %esp
pushl -4(%ecx)
pushl %ebp
movl %esp, %ebp
pushl %ebx
pushl %ecx
call __x86.get_pc_thunk.ax
addl $_GLOBAL_OFFSET_TABLE_, %eax
movl %eax, %ebx
call add@PLT
movl $0, %eax
popl %ecx
popl %ebx
popl %ebp
leal -4(%ecx), %esp
ret
.size main, .-main
.section
.text.__x86.get_pc_thunk.ax,"axG",@progbits,__x86.get_pc_thunk.ax,comdat
.globl __x86.get_pc_thunk.ax
.hidden __x86.get_pc_thunk.ax
.type __x86.get_pc_thunk.ax, @function
__x86.get_pc_thunk.ax:
movl (%esp), %eax
ret
.ident "GCC: (Ubuntu 6.3.0-12ubuntu2) 6.3.0 20170406"
.section .note.GNU-stack,"",@progbits
At my University they told me to use the Flag -m32 if I am using a 64 bit Linux version. Can somebody tell me what I am doing wrong? Am I even using the correct Flag?
edit after -fno-pie
.file "main0.c"
.text
.globl main
.type main, @function
main:
leal 4(%esp), %ecx
andl $-16, %esp
pushl -4(%ecx)
pushl %ebp
movl %esp, %ebp
pushl %ecx
subl $4, %esp
call add
movl $0, %eax
addl $4, %esp
popl %ecx
popl %ebp
leal -4(%ecx), %esp
ret
.size main, .-main
.ident "GCC: (Ubuntu 6.3.0-12ubuntu2) 6.3.0 20170406"
.section .note.GNU-stack,"",@progbits
it looks better but it's not exactly the same. for example what does leal mean?
回答1:
As a general rule, you cannot expect two different compilers to generate the same assembly code for the same input, even if they have the same version number; they could have any number of extra "patches" to their code generation. As long as the observable behavior is the same, anything goes.
You should also know that GCC, in its default -O0
mode, generates intentionally bad code. It's tuned for ease of debugging and speed of compilation, not for either clarity or efficiency of the generated code. It is often easier to understand the code generated by gcc -O1
than the code generated by gcc -O0
.
You should also know that the main
function often needs to do extra setup and teardown that other functions do not need to do. The instruction leal 4(%esp),%ecx
is part of that extra setup. If you only want to understand the machine code corresponding to the code you wrote, and not the nitty details of the ABI, name your test function something other than main
.
(As pointed out in the comments, that setup code is not as tightly tuned as it could be, but it doesn't normally matter, because it's only executed once in the lifetime of the program.)
Now, to answer the question that was literally asked, the reason for the appearance of
call __x86.get_pc_thunk.ax
is because your compiler defaults to generating "position-independent" executables. Position-independent means the operating system can load the program's machine code at any address in (virtual) memory and it'll still work. This allows things like address space layout randomization, but to make it work, you have to take special steps to set up a "global pointer" at the beginning of every function that accesses global variables or calls another function (with some exceptions). It's actually easier to explain the code that's generated if you turn optimization on:
main:
leal 4(%esp), %ecx
andl $-16, %esp
pushl -4(%ecx)
pushl %ebp
movl %esp, %ebp
pushl %ebx
pushl %ecx
This is all just setting up main
's stack frame and saving registers that need to be saved. You can ignore it.
call __x86.get_pc_thunk.bx
addl $_GLOBAL_OFFSET_TABLE_, %ebx
The special function __x86.get_pc_thunk.bx
loads its return address -- which is the address of the addl
instruction that immediately follows -- into the EBX register. Then we add to that address the value of the magic constant _GLOBAL_OFFSET_TABLE_
, which, in position-independent code, is the difference between the address of the instruction that uses _GLOBAL_OFFSET_TABLE_
and the address of the global offset table. Thus, EBX now points to the global offset table.
call add@PLT
Now we call add@PLT
, which means call add
, but jump through the "procedure linkage table" to do it. The PLT takes care of the possibility that add
is defined in a shared library rather than the main executable. The code in the PLT uses the global offset table and assumes that you have already set EBX to point to it, before calling an @PLT symbol. That's why main
has to set up EBX even though nothing appears to use it. If you had instead written something like
extern int number;
int main(void) { return number; }
then you would see a direct use of the GOT, something like
call __x86.get_pc_thunk.bx
addl $_GLOBAL_OFFSET_TABLE_, %ebx
movl number@GOT(%ebx), %eax
movl (%eax), %eax
We load up EBX with the address of the GOT, then we can load the address of the global variable number
from the GOT, and then we actually dereference the address to get the value of number
.
If you compile 64-bit code instead, you'll see something different and much simpler:
movl number(%rip), %eax
Instead of all this mucking around with the GOT, we can just load number
from a fixed offset from the program counter. PC-relative addressing was added along with the 64-bit extensions to the x86 architecture. Similarly, your original program, in 64-bit position-independent mode, will just say
call add@PLT
without setting up EBX first. The call still has to go through the PLT, but the PLT uses PC-relative addressing itself and doesn't need any help from its caller.
The only difference between __x86.get_pc_thunk.bx
and __x86.get_pc_thunk.ax
is which register they store their return address in: EBX for .bx
, EAX for .ax
. I have also seen GCC generate .cx
and .dx
variants. It's just a matter of which register it wants to use for the global pointer -- it must be EBX if there are going to be calls through the PLT, but if there aren't any then it can use any register, so it tries to pick one that isn't needed for anything else.
Why does it call a function to get the return address? Older compilers would do this instead:
call 1f
1: pop %ebx
but that screws up return-address prediction, so nowadays the compiler goes to a little extra trouble to make sure every call
is paired with a ret
.
回答2:
The extra junk you're seeing is due to your version of GCC special-casing main
to compensate for possibly-broken entry point code starting it with a misaligned stack. I'm not sure how to disable this or if it's even possible, but renaming the function to something other than main will suppress it for the sake of your reading.
After renaming to xmain
I get:
xmain:
pushl %ebp
movl %esp, %ebp
subl $8, %esp
call add
movl $0, %eax
leave
ret
来源:https://stackoverflow.com/questions/50105581/how-do-i-get-rid-of-call-x86-get-pc-thunk-ax