cpu-architecture

Does a Length-Changing Prefix (LCP) incur a stall on a simple x86_64 instruction?

核能气质少年 submitted on 2021-01-20 04:49:33
Question: Consider a simple instruction like mov RCX, RDI (encoded as 48 89 f9). The 48 is the REX prefix for x86_64; it is not an LCP. But consider adding an LCP (for alignment purposes): .byte 0x67 followed by mov RCX, RDI (encoded as 67 48 89 f9). 67 is the address-size prefix, which here is applied to an instruction that has no memory operand. This instruction also has no immediates, and it doesn't use the F7 opcode (the false LCP stall case; F7 would be TEST, NOT, NEG, MUL, IMUL, DIV, and IDIV). Assume that it doesn't cross a 16-byte boundary either.
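To make the byte layout in the question concrete, here is a minimal Python sketch that splits the encoding 67 48 89 f9 into its prefix, REX, opcode, and ModRM parts. The function name describe_encoding and its return format are made up for illustration. It only handles the narrow case discussed here (optional 66/67 prefixes, an optional REX byte, opcode 89, register-direct ModRM), and it says nothing about whether the decoder stalls; it only shows what each byte means.

```python
# Names of the 64-bit general-purpose registers, indexed by register number.
REGS64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
          "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]

# The two prefixes that can change instruction length in some encodings.
LEGACY_PREFIXES = {0x66: "operand-size", 0x67: "address-size"}


def describe_encoding(code: bytes) -> dict:
    """Decode `[66|67]* [REX] 89 /r` with a register-direct ModRM byte.

    Hypothetical helper for illustration; anything else raises ValueError.
    """
    i = 0
    prefixes = []
    while i < len(code) and code[i] in LEGACY_PREFIXES:
        prefixes.append(LEGACY_PREFIXES[code[i]])
        i += 1

    rex = 0
    if i < len(code) and 0x40 <= code[i] <= 0x4F:  # REX range in 64-bit mode
        rex = code[i]
        i += 1

    if len(code) - i != 2 or code[i] != 0x89:
        raise ValueError("only opcode 89 /r with a single ModRM byte is handled")
    modrm = code[i + 1]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 3:
        raise ValueError("only register-direct operands are handled")

    rex_w = bool(rex & 0x08)      # 64-bit operand size
    reg |= (rex & 0x04) << 1      # REX.R extends the reg field
    rm |= (rex & 0x01) << 3       # REX.B extends the r/m field

    # Opcode 89 is MOV r/m, r: destination is r/m, source is reg.
    return {"prefixes": prefixes, "rex_w": rex_w,
            "dest": REGS64[rm], "src": REGS64[reg], "length": len(code)}


print(describe_encoding(bytes.fromhex("674889f9")))
```

The output lists one address-size prefix, REX.W set, destination rcx, source rdi, and a total length of 4 bytes, matching the instruction in the question.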

What is a CPU thread and how is it related to logical threads in code?

爱⌒轻易说出口 submitted on 2021-01-18 12:50:51
Question: I have been seeing in the literature that some of the newer CPUs, such as the Intel Xeon "Nehalem-EX", are described as having 8 cores and 16 threads. What are they talking about here? I saw mention of this in reference to SPARC too; surely this isn't the kind of logical thread spawned by code? Is this Hyper-Threading renamed? Answer 1: Yes, Nehalem-based processors implement Hyper-Threading. The new Nehalem-EX which you refer to has 8 physical cores, where each core can be seen as 2 logical cores, for a total
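The "8 cores and 16 threads" figure from the question is simply physical cores times hardware threads per core. A minimal Python sketch follows; the helper logical_cpus is a made-up name for illustration, while os.cpu_count() is the standard-library call that reports how many logical CPUs the operating system sees on the machine running the script.

```python
import os


def logical_cpus(physical_cores: int, threads_per_core: int) -> int:
    """Number of logical CPUs the OS scheduler sees (hypothetical helper)."""
    return physical_cores * threads_per_core


# The Nehalem-EX figures quoted in the question: 8 cores, 2 threads each.
print(logical_cpus(8, 2))

# Logical CPU count of the current machine (may be None if undetermined).
print(os.cpu_count())
```

Software threads spawned by code are a separate concept: the OS schedules any number of them onto these logical CPUs.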

How does Intel x86 implement total order over stores

微笑、不失礼 submitted on 2021-01-05 09:16:06
Question: x86 guarantees a total order over all stores due to its TSO memory model. My question is whether anyone has an idea how this is actually implemented. I have a good idea of how all 4 fences are implemented, so I can explain how local order is preserved. But the 4 fences will just give PO; they won't give you TSO (I know TSO allows newer loads to pass older stores, so only 3 out of 4 fences are needed). Total order over all memory actions on a single address is the responsibility of
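As a rough illustration of what TSO permits, here is a toy Python model in the style of the textbook abstraction: each core has a FIFO store buffer in front of a single shared memory. This is a sketch under that assumption, not a description of how Intel hardware implements it; the class Core and the function store_buffering_litmus are invented names. Stores become globally visible one at a time when they drain into shared memory, which yields a single total store order in the model, while a load may read memory before the core's own older store has drained.

```python
from collections import deque


class Core:
    """Toy core: a FIFO store buffer in front of shared memory (assumption)."""

    def __init__(self, memory: dict):
        self.memory = memory
        self.store_buffer = deque()  # entries are (address, value), oldest first

    def store(self, addr, value):
        # Stores retire into the local buffer; other cores cannot see them yet.
        self.store_buffer.append((addr, value))

    def load(self, addr):
        # Store-to-load forwarding: the youngest buffered store to addr wins.
        for a, v in reversed(self.store_buffer):
            if a == addr:
                return v
        return self.memory[addr]

    def drain_one(self):
        # Commit the oldest buffered store; this is when it becomes global.
        if self.store_buffer:
            addr, value = self.store_buffer.popleft()
            self.memory[addr] = value


def store_buffering_litmus():
    """Each core stores to one variable, then loads the other one."""
    memory = {"x": 0, "y": 0}
    c0, c1 = Core(memory), Core(memory)
    c0.store("x", 1)
    c1.store("y", 1)
    r0 = c0.load("y")  # runs before c1's store has drained
    r1 = c1.load("x")  # runs before c0's store has drained
    for core in (c0, c1):
        while core.store_buffer:
            core.drain_one()
    return r0, r1, memory


print(store_buffering_litmus())
```

In this interleaving both loads return 0 even though both stores eventually reach memory, which is the store-to-load reordering that TSO allows. Because each buffer is FIFO and commits go to one shared memory, all cores still agree on the order in which stores became visible.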

Observing x86 register dependencies

。_饼干妹妹 submitted on 2021-01-05 07:16:24
Question: Are there any other processor registers (e.g. flags) besides the architectural registers (eax, ebx, ...) in x86 for which RAW dependencies need to be enforced by the scoreboard in pipelined processors? Answer 1: Literally every register guarantees that if you write it, later instructions will read the new value. x86 is defined in terms of serial execution; pipelining and out-of-order exec need to preserve that illusion for everything, including segment registers, FP rounding modes, control and debug

How do I determine the architecture of an executable binary on Windows 10

放肆的年华 submitted on 2021-01-03 10:35:24
Question: Given some Random.exe on Windows, how can I determine its CPU architecture (e.g. Intel/ARM) and its bitness (e.g. 32 or 64)? Is there a property in File Explorer, some other tool, or a programmatic method I can use? Answer 1: The architecture of the executable is written in the Machine field of the COFF header. You can retrieve it programmatically or manually with a hex editor: Go to offset 0x3C in the file. The four bytes there hold the offset of the COFF header (from the beginning of the file). Go to the
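The programmatic route can be sketched in Python as follows. This is a minimal sketch, not the answer's own code: the function names pe_machine and describe_pe are made up for illustration, and the table covers only a few common Machine values. It relies on the PE layout in which the offset stored at 0x3C points at the 4-byte signature "PE\0\0", with the 2-byte little-endian Machine field immediately after it.

```python
import struct

# A few common Machine values from the PE/COFF format (not exhaustive).
MACHINE_NAMES = {
    0x014C: "x86 (32-bit)",
    0x8664: "x64 (64-bit)",
    0x01C4: "ARM Thumb-2 (32-bit)",
    0xAA64: "ARM64 (64-bit)",
}


def pe_machine(data: bytes) -> int:
    """Return the raw Machine field of a PE image held in memory."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # Offset 0x3C holds a 4-byte little-endian offset to the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # The COFF header starts right after the signature; Machine comes first.
    (machine,) = struct.unpack_from("<H", data, pe_offset + 4)
    return machine


def describe_pe(path: str) -> str:
    """Read a file from disk and name its target architecture."""
    with open(path, "rb") as f:
        machine = pe_machine(f.read())
    return MACHINE_NAMES.get(machine, f"unknown (0x{machine:04X})")
```

For example, describe_pe(r"C:\path\to\Random.exe") would return one of the names in the table above, or an "unknown" string carrying the raw value.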
