x86 XOR opcode differences

Question

Looking at http://ref.x86asm.net/coder32.html, I found two opcodes that match the statement

xor eax,eax

1) opcode 31 XOR r/m16/32 r16/32

2) opcode 33 XOR r16/32 r/m16/32

Both forms take 32-bit registers for operand 1 and operand 2. So is there any difference in this specific case of XORing two 32-bit registers?

Answer 1:

x86 has two redundant ways to encode the 2-register form of any basic ALU instruction that has both an r/m-source and an r/m-destination opcode.

This redundancy is a consequence of how x86 machine code allows a memory-destination or a memory-source for most instructions: instead of spending bits in the ModR/M byte to have a flexible encoding for both operands, there are simply two separate opcodes for most instructions.

(This is why two explicit memory operands, like xor [eax], [ecx], aren't allowed for any instruction. Only a few instructions where one or both memory operands are implicit, like rep movs or push [mem], allow two memory operands; no instruction ever has two separate ModR/M-encoded addressing modes.)
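To make the redundancy concrete, here is a minimal Python sketch that packs the ModR/M byte by hand for both opcodes. The helper names (`modrm`, `encode_xor`) and the `REG32` table are made up for this illustration; only the opcode values 31 and 33 and the mod/reg/rm field layout come from the instruction-set reference. It handles register-direct operands only (mod = 11).

```python
# Register numbers used in the ModR/M reg and r/m fields (32-bit mode).
REG32 = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
         "esp": 4, "ebp": 5, "esi": 6, "edi": 7}

XOR_RM_DEST = 0x31  # XOR r/m32, r32: destination lives in the r/m field
XOR_RM_SRC = 0x33   # XOR r32, r/m32: destination lives in the reg field


def modrm(mod, reg, rm):
    """Pack the three ModR/M fields into one byte: mod(2) reg(3) rm(3)."""
    return (mod << 6) | (reg << 3) | rm


def encode_xor(dst, src):
    """Return both encodings of `xor dst, src` (Intel operand order)."""
    d, s = REG32[dst], REG32[src]
    rm_dest_form = bytes([XOR_RM_DEST, modrm(0b11, reg=s, rm=d)])
    rm_src_form = bytes([XOR_RM_SRC, modrm(0b11, reg=d, rm=s)])
    return rm_dest_form, rm_src_form


if __name__ == "__main__":
    for dst, src in [("eax", "eax"), ("ecx", "eax")]:
        a, b = encode_xor(dst, src)
        print(f"xor {dst}, {src}: {a.hex(' ')} or {b.hex(' ')}")
```

With the same register in both fields (xor eax, eax) the two encodings differ only in the opcode byte; with different registers the ModR/M byte changes too, because the reg and r/m fields swap roles.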

For reg,reg instructions, there's no difference in how they decode and execute on any CPUs I'm aware of. The only time you need to care about which encoding your assembler uses is when you want the machine code to meet some other requirement, like using only bytes that represent printable ASCII characters (e.g. for an exploit payload).
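As a rough illustration of choosing an encoding to satisfy a byte-level constraint, the sketch below filters the two hand-written encodings of xor ecx, eax against a set of forbidden byte values. The forbidden set is a purely hypothetical constraint and the function names are invented for this example; a printable-ASCII requirement could be checked the same way by forbidding every byte outside 0x20..0x7e.

```python
def avoids(code, forbidden):
    """True if none of the bytes in `code` appears in the forbidden set."""
    return not any(b in forbidden for b in code)


# Both encodings of `xor ecx, eax` (Intel operand order), written out by hand.
ENCODINGS = {
    "31 /r (r/m destination)": bytes.fromhex("31c1"),
    "33 /r (r/m source)": bytes.fromhex("33c8"),
}


def usable_encodings(forbidden):
    """Names of the encodings that contain no forbidden byte."""
    return [name for name, code in ENCODINGS.items() if avoids(code, forbidden)]


if __name__ == "__main__":
    # Hypothetical constraint: the byte 0xc8 must not appear in the payload.
    print(usable_encodings({0xC8}))
```

Under that assumed constraint only the 31 /r form survives, which is the kind of situation where an encoding override in the assembler becomes useful.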

Some assemblers have syntax for overriding their default choice of encoding, e.g. GAS had a .s suffix to get the non-default encoding. That's now deprecated, and you should use {load} or {store} prefixes before the mnemonic (see the docs), like so:

{load} xor %eax, %ecx
{store} xor %eax, %ecx
{vex3} vpaddd %xmm0, %xmm1, %xmm1
vpaddd %xmm0, %xmm1, %xmm1        # default is to use 2-byte VEX when possible

gcc -c foo.S && objdump -drwC foo.o

0:   33 c8                   xor    %eax,%ecx
2:   31 c1                   xor    %eax,%ecx
4:   c4 e1 71 fe c8          vpaddd %xmm0,%xmm1,%xmm1
9:   c5 f1 fe c8             vpaddd %xmm0,%xmm1,%xmm1
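The two VEX prefixes in that listing can be derived by hand. The following Python sketch builds the 2-byte and 3-byte prefixes from their bit fields, assuming the standard VEX layout (R, X, B and vvvv are stored inverted; mmmmm = 1 selects the 0F opcode map; pp = 1 stands for the 66 prefix). The function names are made up for this example, and it only covers xmm0..xmm7, so R, X and B are always 0.

```python
def vex2(r, vvvv, l, pp):
    """2-byte VEX prefix: C5 followed by [~R ~vvvv L pp]."""
    return bytes([0xC5, ((~r & 1) << 7) | ((~vvvv & 0xF) << 3) | (l << 2) | pp])


def vex3(r, x, b, mmmmm, w, vvvv, l, pp):
    """3-byte VEX prefix: C4, then [~R ~X ~B mmmmm], then [W ~vvvv L pp]."""
    byte1 = ((~r & 1) << 7) | ((~x & 1) << 6) | ((~b & 1) << 5) | mmmmm
    byte2 = (w << 7) | ((~vvvv & 0xF) << 3) | (l << 2) | pp
    return bytes([0xC4, byte1, byte2])


def encode_vpaddd(dst, src1, src2):
    """Both encodings of `vpaddd xmm<dst>, xmm<src1>, xmm<src2>` (Intel order)."""
    if not all(0 <= n < 8 for n in (dst, src1, src2)):
        raise ValueError("this sketch only handles xmm0..xmm7")
    # Opcode FE, then a register-direct ModR/M byte: reg = dst, r/m = src2.
    tail = bytes([0xFE, 0xC0 | (dst << 3) | src2])
    short = vex2(r=0, vvvv=src1, l=0, pp=1) + tail
    long = vex3(r=0, x=0, b=0, mmmmm=1, w=0, vvvv=src1, l=0, pp=1) + tail
    return short, long


if __name__ == "__main__":
    short, long = encode_vpaddd(dst=1, src1=1, src2=0)
    print(short.hex(" "))
    print(long.hex(" "))
```

The 3-byte form carries the same information as the 2-byte form here; its extra byte only matters when X, B, W or a different opcode map is needed, which is why the assembler defaults to the shorter prefix.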

(Related: "What methods can be used to efficiently extend instruction length on modern x86?" covers use-cases for {vex3}, {evex} and {disp32}.)

NASM also has {vex2}, {vex3}, and {evex} prefixes with the same syntax as GAS: {vex3} vpaddd xmm1, xmm1, xmm0. But I don't see a way to override the op r/m, r vs. op r, r/m choice of opcodes.

Source: https://stackoverflow.com/questions/50336269/x86-xor-opcode-differences

Tags

assembly

x86

bit-manipulation

xor

opcode