Question
As a concrete example, on GAS 2.24, moving the address:
mov $s, %eax
s:
After assembling and dumping the object file with:
as --64 -o a.o a.S
objdump -Sr a.o
Uses zero extension:
0000000000000000 <s-0x5>:
0: b8 00 00 00 00 mov $0x0,%eax
1: R_X86_64_32 .text+0x5
But a memory access:
mov s, %eax
s:
Compiles to sign extension:
0000000000000000 <s-0x7>:
0: 8b 04 25 00 00 00 00 mov 0x0,%eax
3: R_X86_64_32S .text+0x7
Is there a rationale for using either in this specific case, or in general? I don't understand how the assembler could make any better assumption in either case.
NASM 2.10.09 just uses R_X86_64_32 for both of the above. Update: a bleeding-edge NASM commit 6377180, after 2.11, produces the same output as GAS, so the older behavior seems to have been a bug, as Ross mentioned.
I have explained what I think I understand about R_X86_64_32S at: https://stackoverflow.com/a/33289761/895245
Answer 1:
The difference is in the allowed addresses for the symbol s. In the first case, with R_X86_64_32, the symbol must be in the range 0x00000000'00000000 to 0x00000000'FFFFFFFF. In the second case, with R_X86_64_32S, the address of the symbol must be between 0xFFFFFFFF'80000000 and 0x00000000'7FFFFFFF. If s ends up with an address outside of these ranges, the linker will give an error.
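The two ranges can be written out as a small check. This is only an illustrative sketch: the function names are hypothetical, and it models the range test described above, not any particular linker's implementation.

```python
# Hypothetical helpers; only the address ranges come from the answer above.
# Addresses are treated as unsigned 64-bit integers.

def fits_r_x86_64_32(addr: int) -> bool:
    """True if addr can be stored in 32 bits and recovered by zero extension."""
    return 0x0000_0000_0000_0000 <= addr <= 0x0000_0000_FFFF_FFFF


def fits_r_x86_64_32s(addr: int) -> bool:
    """True if addr can be stored in 32 bits and recovered by sign extension."""
    # Viewed as unsigned values, the valid set has two pieces:
    # the lowest 2 GiB and the highest 2 GiB of the 64-bit address space.
    low_half = 0x0000_0000_0000_0000 <= addr <= 0x0000_0000_7FFF_FFFF
    high_half = 0xFFFF_FFFF_8000_0000 <= addr <= 0xFFFF_FFFF_FFFF_FFFF
    return low_half or high_half


if __name__ == "__main__":
    for addr in (0x7FFF_FFFF, 0x8000_0000, 0xFFFF_FFFF_8000_0000):
        print(hex(addr), fits_r_x86_64_32(addr), fits_r_x86_64_32s(addr))
```

An address such as 0x80000000 passes the first check but not the second, and that is the case where the choice of relocation matters.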
This corresponds to how the CPU interprets the 32-bit value of s encoded into the two instructions. In the first instruction, where it's an immediate operand, the 32-bit value is zero extended into RAX. In the second instruction the 32-bit value is a displacement in a memory operand, and so is sign extended to form a 64-bit address.
NASM shouldn't be using the unsigned R_X86_64_32 relocation for the second instruction. It's not a question of which one is better; using R_X86_64_32 here is simply incorrect. NASM would permit the address of s to be 0x00000000'80000000, but the CPU would end up accessing 0xFFFFFFFF'80000000 instead.
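To make the mismatch concrete, here is a rough model of what the CPU does with the 32-bit field in each instruction. The helper names are made up for illustration; the arithmetic is ordinary two's-complement extension to 64 bits.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def zero_extend32(value: int) -> int:
    """Model of a 32-bit immediate written to a 32-bit register (upper half cleared)."""
    return value & MASK32


def sign_extend32(value: int) -> int:
    """Model of a 32-bit displacement widened to a 64-bit address."""
    value &= MASK32
    if value & 0x8000_0000:   # bit 31 set: the upper 32 bits become all ones
        value -= 1 << 32
    return value & MASK64     # present the result as an unsigned 64-bit number


s = 0x0000_0000_8000_0000     # the address from the answer above
stored = s & MASK32           # what ends up in the 32-bit field of the instruction

print(hex(zero_extend32(stored)))  # value seen by: mov $s, %eax
print(hex(sign_extend32(stored)))  # address used by: mov s, %eax
```

For this s, the zero-extended value round-trips back to s, while the sign-extended one does not. That is why an unsigned relocation is safe for the immediate but not for the displacement.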
Answer 2:
With the immediate-data mov, the assembler is just doing what you wrote. Writing to a 32-bit register always zero-extends into the upper 32 bits of the full 64-bit register in x86-64. As documented in the Intel instruction set reference manual:
MOV r/m64, imm32 means: Move imm32 sign extended to 64-bits to r/m64.
MOV r/m32, imm32 means: Move imm32 to r/m32.
If you wanted sign extension, to match how 32-bit addresses are treated in 32-bit absolute addressing modes, you should have written:
mov $s, %rax
32-bit displacements are always sign-extended. So I think Ross's answer is right that NASM 2.10.09 is buggy. It's apparently telling the linker that the address will be zero-extended, when in fact it will be sign-extended. Of course, RIP-relative addressing takes fewer instruction bytes, so it should be preferred over absolute addressing when possible.
Source: https://stackoverflow.com/questions/33318342/when-is-it-better-for-an-assembler-to-use-sign-extended-relocation-like-r-x86-64