What does the MOVZBL instruction do in IA-32 AT&T syntax?

时光总嘲笑我的痴心妄想 submitted on 2019-11-28 08:54:22
Igor Skochinsky

AT&T syntax splits the movzx Intel instruction mnemonic into different mnemonics for different source sizes (movzb vs. movzw). In Intel syntax, it's:

movzx eax, byte ptr [eax+ecx+1]

i.e. load a byte from memory at address eax+ecx+1 and zero-extend it to the full 32-bit register.
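If it helps to see the same operation outside assembly, here is a rough C model of that load. It is only a sketch: load_byte_zero_extended and load_byte_self_check are made-up names, mem stands in for memory, and eax/ecx are treated as plain byte offsets rather than real addresses.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from the original answer): models
   movzx eax, byte ptr [eax+ecx+1]. `mem` stands in for memory and
   `eax`/`ecx` are used as byte offsets into it (an assumption). */
uint32_t load_byte_zero_extended(const uint8_t *mem, uint32_t eax, uint32_t ecx)
{
    uint8_t byte = mem[eax + ecx + 1]; /* read exactly one byte */
    return (uint32_t)byte;             /* upper 24 bits of the result are zero */
}

/* Small self-check in the spirit of "examples with assertions". */
void load_byte_self_check(void)
{
    const uint8_t mem[] = {0x11, 0x22, 0x33, 0x80, 0xFF};
    assert(load_byte_zero_extended(mem, 0, 2) == 0x00000080u); /* mem[3] */
    assert(load_byte_zero_extended(mem, 1, 2) == 0x000000FFu); /* mem[4] */
}
```

Note that a byte with its top bit set (0x80, 0xFF) still produces a small positive 32-bit value, because nothing is copied into the upper bits.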

BTW, most GNU tools now have a switch or config option to prefer Intel syntax, such as objdump -Mintel or gcc -S -masm=intel (although the latter also affects the syntax expected when compiling inline asm). I would certainly recommend looking into it if you don't do AT&T assembly for a living. See also the tag wiki for more docs and guides.

Minimal example

mov $0x01234567, %eax
mov $1, %bl
movzbl %bl, %eax
/* %eax == 0x00000001 */

mov $0x01234567, %eax
mov $-1, %bl
movzbl %bl, %eax
/* %eax == 0x000000FF */

Runnable GitHub upstream with assertions.
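For readers who want to check the -1 case without an assembler, the sketch below models the register form in portable C. The function names are hypothetical, and the sign-extending variant is included only as a contrast (in AT&T syntax that would be movsbl); it is not part of the example above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of `movzbl %bl, %eax`: reinterpret the byte's bits as
   unsigned, then widen. The previous %eax value plays no role because all
   32 bits of the destination are overwritten. */
uint32_t zero_extend_byte(int8_t bl)
{
    return (uint32_t)(uint8_t)bl;
}

/* Contrast only: sign extension copies bit 7 into the upper 24 bits. */
uint32_t sign_extend_byte(int8_t bl)
{
    return (uint32_t)(int32_t)bl;
}

void zero_extend_self_check(void)
{
    assert(zero_extend_byte(1)  == 0x00000001u);
    assert(zero_extend_byte(-1) == 0x000000FFu); /* -1 is 0xFF as a byte */
    assert(sign_extend_byte(-1) == 0xFFFFFFFFu); /* what zero extension avoids */
}
```

So `mov $-1, %bl` stores the bit pattern 0xFF, and zero extension just pads it with zeros instead of propagating the sign bit.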

The mnemonic breaks down as:

  • MOV
  • Zero extend
  • Byte (8-bit)
  • to Long (32-bit)

There are also versions for other sizes:

  • movzbw: Byte (8-bit) to Word (16-bit)
  • movzwl: Word (16-bit) to Long (32-bit)
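The other two variants can be modeled the same way. This is a sketch with made-up function names; the one detail worth noticing is that a 16-bit destination such as %ax only replaces the low half of %eax, so the model for movzbw takes the old %eax value as an input.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of `movzbw %bl, %ax`: only the low 16 bits of %eax
   are written; the upper 16 bits keep their previous contents. */
uint32_t movzbw_model(uint8_t bl, uint32_t old_eax)
{
    return (old_eax & 0xFFFF0000u) | (uint32_t)bl;
}

/* Hypothetical model of `movzwl %bx, %eax`: the whole register is written. */
uint32_t movzwl_model(uint16_t bx)
{
    return (uint32_t)bx;
}

void movz_sizes_self_check(void)
{
    assert(movzbw_model(0xFF, 0x01234567u) == 0x012300FFu);
    assert(movzwl_model(0xFFFF) == 0x0000FFFFu);
}
```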

Like most GAS instructions, you can omit the last size character when dealing with registers:

movzb %bl, %eax

but I cannot understand why we can't omit the second-to-last letter, e.g. the following fails:

movz %bl, %eax

Why not just deduce it from the operand sizes when they are registers, as is done for mov and in Intel syntax?

And if you use registers of the wrong size, it fails to assemble, e.g.:

movzb %ax, %eax