x86-64 instruction set, AT&T syntax, confusion regarding lea and brackets

Question

I’ve been told that lea %rax, %rdx is invalid syntax because the source needs to be in brackets, i.e. lea (%rax), %rdx.

I think I’ve clearly misunderstood both lea and the purpose of brackets.

I thought that lea %rax, %rdx would move the memory address stored in %rax, to %rdx, but apparently this is what lea (%rax), %rdx does?

What confuses me is that I thought brackets signify going to an address in memory, and taking the value at that address. So by using brackets lea would be moving a value from the memory address stored in %rax into the destination register.

That is why I thought lea %rax, %rdx would be used if you just wanted to move the address stored in %rax into %rdx.

Could someone explain to me the significance of brackets in the case of the lea instruction?

Answer 1:

Never actually use lea (%rax), %rdx. Use mov %rax, %rdx instead because CPUs run it more efficiently, and both ways copy a register value (regardless of whether that value is a valid pointer or not).

LEA can only work on a memory addressing mode, not a bare register. LEA kind of "undoes" the brackets, taking the result of the address calculation instead of the value from memory at that address. This can't happen if there wasn't a memory operand in the first place.
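As a rough illustration of "undoing the brackets", here is a minimal C sketch that models the two instructions as functions. The names model_mov_load, model_lea and check_models are hypothetical, made up for this example; the point is only that the mov form reads memory while the lea form stops at the address.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of `mov (%rax), %rdx`: follow the pointer and load
   the 8 bytes stored at that address. */
uint64_t model_mov_load(const uint64_t *rax) {
    return *rax;
}

/* Hypothetical model of `lea (%rax), %rdx`: form the address and stop.
   In C, &*p is defined to be just p, so nothing is read from memory. */
uint64_t model_lea(const uint64_t *rax) {
    return (uint64_t)(uintptr_t)&*rax;
}

/* Small self-check: the load yields the stored value, lea yields the address. */
void check_models(void) {
    uint64_t x = 42;
    assert(model_mov_load(&x) == 42);
    assert(model_lea(&x) == (uint64_t)(uintptr_t)&x);
}
```

Note that model_lea returns the same number that was passed in, which is why lea (%rax), %rdx does nothing that mov %rax, %rdx doesn't already do.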

This lets you use it to do shift/add operations on arbitrary values, whether they're valid pointers or not (see the related Q&A "Using LEA on values that aren't addresses / pointers?"). LEA uses memory-operand syntax and machine code to encode the shift/add operation into a single instruction, using x86's normal addressing-mode encoding that the CPU hardware already knows how to decode.
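To make the shift/add point concrete, here is a small C sketch of the arithmetic a single lea can encode, applied to plain integers rather than pointers. The function names lea_style, times5 and check_lea_sketch are hypothetical, and the registers in the comments assume the x86-64 System V calling convention; an optimizing compiler will often, but not necessarily, pick lea for expressions shaped like these.

```c
#include <assert.h>
#include <stdint.h>

/* Same arithmetic as `lea 1234(%rdi,%rsi,4), %rax`:
   base + index*scale + displacement, with scale limited to 1, 2, 4 or 8. */
uint64_t lea_style(uint64_t base, uint64_t index) {
    return base + index * 4 + 1234;
}

/* x*5 written as x + x*4, the shape of `lea (%rdi,%rdi,4), %rax`. */
uint64_t times5(uint64_t x) {
    return x + x * 4;
}

/* Small self-check with values that are clearly not addresses. */
void check_lea_sketch(void) {
    assert(lea_style(10, 3) == 1256);   /* 10 + 3*4 + 1234 */
    assert(times5(7) == 35);            /* 7 + 7*4 */
}
```

Neither function ever dereferences anything, which matches lea: it only needs the inputs to be numbers, not valid pointers.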

Compared to mov, lea is like C's & (address-of) operator. And you can't take the address of a register. (Or in C, of a register variable.) You can only use it to undo a dereference.

  register char *rax = ...;

  register char dl = *rax;       // mov   (%rax), %dl
  register char *rcx = rax;      // mov   %rax, %rcx
  register char *rdi = &rax[0];  // lea   (%rax), %rdi  // never do this, mov is more efficient
  register char *rbx = &rax[rdx*4 + 1234];  // lea 1234(%rax, %rdx, 4), %rbx  // a real use-case
  
  register char **rsi = &rax;    // lea %rax, %rsi   // ERROR: can't take the address of a register

Of course if you asked an actual C compiler to compile that, you'd get mov %rax, %rdi, not lea (%rax), %rdi, even if it didn't optimize away the code. This is in terms of conceptual equivalents, using C syntax and operators to explain asm, not to show how anything would or should actually compile.

Source: https://stackoverflow.com/questions/61304524/x86-64-instruction-set-att-syntax-confusion-regarding-lea-and-brackets

Tags

assembly

x86-64

att

instruction-set