Questions about AT&T x86 Syntax design

天涯浪人 2020-12-01 10:47

Can anyone explain to me why every constant in AT&T syntax has a '$' in front of it?
Why do all registers have a '%'?
Is this just another at

4 answers

时光说笑 (original poster)

2020-12-01 11:31
The GNU assembler's AT&T syntax traces its origins to the Unix assembler ¹, which itself took its input syntax mostly from the PDP-11 PAL-11 assembler (ca. 1970).

Can anyone explain to me why every constant in AT&T syntax has a '$' in front of it?

It distinguishes immediate constants from memory addresses. Intel syntax does it the other way around: a bare constant is an immediate, and memory references are marked with brackets, as in [foo].
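
As a rough illustration of that rule, here is a minimal Python sketch that classifies an AT&T operand by its leading character alone. The function name `classify_operand` is hypothetical, and the sketch deliberately ignores segment overrides and `*` indirect jump targets; it is not how gas itself parses operands.

```python
def classify_operand(operand: str) -> str:
    """Classify an AT&T-syntax operand by its leading sigil (simplified)."""
    operand = operand.strip()
    if operand.startswith("$"):
        return "immediate"  # $4, $foo: the value itself
    if operand.startswith("%"):
        return "register"   # %eax
    return "memory"         # 4, foo, 16(%esp): the contents at that address


for text in ("$4", "4", "$foo", "foo", "%eax"):
    print(f"{text:>5} -> {classify_operand(text)}")
```

So `movl $4, %eax` loads the number 4, while `movl 4, %eax` loads whatever is stored at address 4.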

Incidentally, MASM (the Microsoft Assembler) doesn't need a distinction at the syntax level, since it can tell whether the operand is a symbolic constant or a label. Other x86 assemblers actively avoid such guesses, since they can confuse readers, e.g. TASM in IDEAL mode (which warns on memory references not in brackets), NASM, and FASM.

PAL-11 used # for the Immediate addressing mode, where the operand followed the instruction. A constant without # meant Relative addressing mode, where a relative address followed the instruction.

The Unix assembler (as) used the same syntax for addressing modes as the DEC assemblers, but with * instead of @, and $ instead of #, since @ and # were apparently inconvenient to type ².

Why do all registers have a '%'?

In PAL-11, registers were defined as R0=%0, R1=%1, ... with R6 also referred to as SP, and R7 also referred to as PC. The DEC MACRO-11 macro-assembler allowed referring to registers as %x, where x could be an arbitrary expression, e.g. %3+1 referred to %4.

Is this just another attempt to get me to do a lot of lame typing?

Nope.

Also, am I the only one that finds: 16(%esp) really counterintuitive compared to [esp+16]?

This comes from the PDP-11 Index addressing mode, where a memory address is formed by summing the contents of a register and an index word following the instruction.
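
To make the correspondence between the two notations concrete, the following Python sketch rewrites an AT&T memory operand of the general form disp(base,index,scale) into Intel-style brackets. The helper name `att_to_intel` and the regular expression are illustrative assumptions; real assemblers accept more forms (symbol expressions, segment prefixes) than this handles.

```python
import re

# disp(base,index,scale): every part except the parentheses is optional.
_MEM = re.compile(
    r"^(?P<disp>[^()]*)\(\s*(?P<base>%\w+)?\s*"
    r"(?:,\s*(?P<index>%\w+)\s*(?:,\s*(?P<scale>[1248]))?)?\s*\)$"
)


def att_to_intel(operand: str) -> str:
    """Rewrite e.g. 16(%esp) as [esp+16] (simplified, hypothetical helper)."""
    m = _MEM.match(operand.strip())
    if m is None:
        raise ValueError(f"not a disp(base,index,scale) operand: {operand!r}")
    terms = []
    if m["base"]:
        terms.append(m["base"].lstrip("%"))
    if m["index"]:
        scale = m["scale"] or "1"
        terms.append(f"{m['index'].lstrip('%')}*{scale}")
    body = "+".join(terms)
    disp = m["disp"].strip()
    if disp:
        # A negative displacement already carries its own sign.
        body += disp if (disp.startswith("-") or not body) else "+" + disp
    return f"[{body}]"


for text in ("16(%esp)", "(%eax)", "-4(%ebp)", "8(%ebx,%ecx,4)"):
    print(f"{text:>16} -> {att_to_intel(text)}")
```

The displacement written in front of the parentheses plays the role of the PDP-11 index word, and the register in parentheses is the one whose contents are added to it.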

I know it compiles to the same thing but why would anyone want to type a lot of '$' and '%'s without a need to? - Why did GNU choose this syntax as the default?

It came from the PDP-11.

Another thing, why is every instruction in at&t syntax preceded by an: l? - I do know its for the operand sizes, however why not just let the assembler figure that out? (would I ever want to do a movl on operands that are not that size?)

gas can usually figure it out. Other assemblers also need help in particular cases.
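
A simplified way to picture when the suffix is redundant: for a plain mov, a register operand already fixes the operand size, whereas an immediate stored to memory does not. The Python sketch below models only that case; the table `REGISTER_SUFFIX` and the function `infer_mov_suffix` are hypothetical names, the register list is far from exhaustive, and instructions such as shifts or sign-extending moves follow different rules.

```python
# Operand size implied by a few 32-bit-era register names (not exhaustive).
REGISTER_SUFFIX = {
    "%eax": "l", "%ebx": "l", "%ecx": "l", "%edx": "l",
    "%ax": "w", "%bx": "w", "%cx": "w", "%dx": "w",
    "%al": "b", "%bl": "b", "%cl": "b", "%dl": "b",
}


def infer_mov_suffix(src: str, dst: str):
    """Return 'b', 'w' or 'l' if a register operand fixes the size, else None."""
    for operand in (dst, src):
        if operand in REGISTER_SUFFIX:
            return REGISTER_SUFFIX[operand]
    return None  # e.g. mov $1, (%ebx): nothing says how many bytes to store


for src, dst in (("$1", "%eax"), ("%al", "(%ebx)"), ("$1", "(%ebx)")):
    print(f"mov {src}, {dst}: suffix = {infer_mov_suffix(src, dst)}")
```

The last case is where the programmer has to help: in AT&T syntax by writing movl (or movw, movb), in Intel syntax with a size keyword such as dword ptr.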

The PDP-11 used b for byte instructions, e.g. CLR vs CLRB. Other suffixes appeared in VAX-11: l for long, w for word, f for float, d for double, q for quad-word, ...

Last thing: why are the mov arguments inverted?

Arguably, since the PDP-11 predates Intel microprocessors, it is the other way around.

¹ According to the gas info page, through the BSD 4.2 assembler.

² Unix Assembler Reference Manual, §8.1, Dennis M. Ritchie.