Endianness inside CPU registers

风流意气都作罢 submitted on 2019-11-30 02:24:16

Endianness makes sense only for memory, where each byte has a numeric address. When the MSByte of a value is stored at a higher memory address than the LSByte, the layout is called little endian, and this is the byte order of every x86 processor.
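As a rough illustration of that definition, the Python sketch below lays out one 32-bit value in both byte orders. The helper name memory_layout is made up for this example; the real work is done by the standard int.to_bytes method.

```python
def memory_layout(value, byteorder):
    """Return the 4 bytes of a 32-bit value, lowest address first."""
    return [f"0x{b:02x}" for b in value.to_bytes(4, byteorder)]

# Little endian (x86): the LSByte sits at the lowest address.
print(memory_layout(0x12345678, "little"))  # ['0x78', '0x56', '0x34', '0x12']

# Big endian: the MSByte sits at the lowest address.
print(memory_layout(0x12345678, "big"))     # ['0x12', '0x34', '0x56', '0x78']
```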

While for integers the distinction between LSByte and MSByte is clear:

    0x12345678
MSB---^^    ^^---LSB

This distinction is not defined for string literals! It is not obvious which part of 'WXYZ' should be considered the LSB or the MSB:

1) The most obvious way,

'WXYZ' ->  0x5758595A

would lead to memory order ZYXW.

2) The not so obvious way, where the memory order matches the order of the characters in the literal:

'WXYZ' ->  0x5A595857

The assembler has to choose one of them, and apparently it chooses the second.
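The two interpretations can be compared with a small Python sketch. The function names literal_as_value and stored_on_x86 are hypothetical helpers for this illustration, not part of any assembler.

```python
def literal_as_value(text, first_char_is_lsb):
    """Turn a 4-character literal into a 32-bit integer under either convention."""
    data = text.encode("ascii")
    return int.from_bytes(data, "little" if first_char_is_lsb else "big")

def stored_on_x86(value):
    """Bytes as they would appear in little-endian memory, lowest address first."""
    return value.to_bytes(4, "little")

option1 = literal_as_value("WXYZ", first_char_is_lsb=False)  # first char is the MSB
option2 = literal_as_value("WXYZ", first_char_is_lsb=True)   # first char is the LSB

print(hex(option1), stored_on_x86(option1))  # 0x5758595a b'ZYXW'
print(hex(option2), stored_on_x86(option2))  # 0x5a595857 b'WXYZ'
```

Only the second convention leaves the characters in reading order once the value is written to memory.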

Endianness inside a register makes no sense, since endianness describes whether the bytes are ordered from low to high memory address or from high to low memory address. Registers are not byte addressable, so there is no low or high address within a register. What you are seeing is how your debugger prints out the data.

The assembler handles the two constants differently. A register has no byte order of its own; a debugger simply prints its value the way numbers are normally written, most significant digit first. You can see that by writing:

mov eax, 1

If you inspect the register, you'll see that its value is 0x00000001.

When you tell the assembler that you want the constant value 0x78ff5abc, that's exactly what gets stored in the register. The high 8 bits of EAX will contain 0x78, and the AL register contains 0xbc.

Now if you were to store the value from EAX into memory, its bytes would be laid out least significant first, the reverse of how the number is written. That is, if you were to write:

mov [addr],eax

And then inspected memory at [addr], you would see 0xbc, 0x5a, 0xff, 0x78.
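The store can be modelled in Python with a bytearray standing in for memory. This is only a sketch: the buffer size and the offset addr = 2 are arbitrary choices for the example, and struct's "<I" format is used to mimic a little-endian 32-bit store and load.

```python
import struct

memory = bytearray(8)   # hypothetical stand-in for RAM, all zeros
addr = 2                # arbitrary offset chosen for this example
eax = 0x78FF5ABC

# Models: mov [addr], eax
struct.pack_into("<I", memory, addr, eax)
stored = [hex(b) for b in memory[addr:addr + 4]]
print(stored)           # ['0xbc', '0x5a', '0xff', '0x78']

# Models: mov eax, [addr]
(reloaded,) = struct.unpack_from("<I", memory, addr)
print(hex(reloaded))    # 0x78ff5abc
```

Loading the four bytes back yields the original value, which is why the byte order is invisible as long as you only look at the register.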

In the case of 'WXYZ', the assembler assumes that you want to load the value such that if you were to write it to memory, it would be laid out as 0x57, 0x58, 0x59, 0x5a.

Take a look at the code bytes that the assembler generates and you'll see the difference. In the case of mov eax,0x78ff5abc, you'll see:

<opcodes for mov eax>, 0xbc, 0x5a, 0xff, 0x78

In the case of mov eax,'WXYZ', you'll see:

<opcodes for mov eax>, 0x57, 0x58, 0x59, 0x5a
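The generated bytes can be approximated with a short Python sketch. It assumes 32-bit or 64-bit code, where the short form of mov eax, imm32 is the single opcode byte 0xB8 followed by the immediate in little-endian order, and it assumes the assembler treats the first character of the literal as the least significant byte, as described above. The function encode_mov_eax_imm32 is a made-up helper, not an assembler API.

```python
def encode_mov_eax_imm32(imm):
    """Opcode 0xB8 (mov eax, imm32) followed by the little-endian immediate."""
    return bytes([0xB8]) + imm.to_bytes(4, "little")

def as_hex(code):
    return " ".join(f"{b:02x}" for b in code)

print(as_hex(encode_mov_eax_imm32(0x78FF5ABC)))  # b8 bc 5a ff 78

# First character becomes the least significant byte of the immediate.
wxyz = int.from_bytes(b"WXYZ", "little")
print(as_hex(encode_mov_eax_imm32(wxyz)))        # b8 57 58 59 5a
```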

In simple words, treat registers as just values; the endianness with which they are finally stored is not important.

You know that writing to eax stores a 32-bit number, and that reading from eax returns the same 32-bit number. In these terms, endianness doesn't matter.

Then you know that "al" holds the least significant 8 bits of the value, and "ah" holds the most significant 8 bits of the lower 16 bits. There is no way to access individual bytes of the upper 16 bits, except of course by reading the whole 32-bit value.
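In terms of plain arithmetic, the sub-registers are just bit fields of the 32-bit value. The Python sketch below mirrors that with shifts and masks; the variable names follow the register names, and upper16 is an illustrative name only, since x86 has no register name for that half.

```python
eax = 0x78FF5ABC

al = eax & 0xFF           # bits 0..7, the least significant byte
ah = (eax >> 8) & 0xFF    # bits 8..15, the high byte of the lower 16 bits
ax = eax & 0xFFFF         # bits 0..15
upper16 = eax >> 16       # bits 16..31, reachable only through the full register

print(hex(al), hex(ah), hex(ax), hex(upper16))  # 0xbc 0x5a 0x5abc 0x78ff
```

No memory address appears anywhere in these expressions, so byte order never enters the picture.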
