Intel JCC Erratum - should JCC really be treated separately?

喜欢而已 posted on 2020-06-27 17:21:07

Question


Intel pushed a microcode update to fix an erratum called the "Jump Conditional Code (JCC) Erratum". The updated microcode makes some code run less efficiently, because under certain conditions it prevents code from being cached in the Decoded ICache (the uop cache).

The published document, titled Mitigations for Jump Conditional Code Erratum, lists not only JCC but also unconditional jumps, conditional jumps, macro-fused conditional jumps, calls, and returns.

The documentation for the MSVC switch /QIntel-jcc-erratum says:

Under /QIntel-jcc-erratum, the compiler detects jump and macro-fused jump instructions that cross or end on a 32-byte boundary.

The questions are:

  • Are there reasons to treat JCC separately from other jumps?
  • Are there reasons to treat macro-fused JCC mentioned separately from other JCC?

Answer 1:


Macro-fused jumps have to be mentioned separately because fusion means the whole cmp/jcc pair (or whatever instruction fuses with the jcc) is vulnerable to this slowdown if the cmp touches the boundary, even when the jcc itself doesn't. That's because the uop cache holds a single uop for both of those x86 machine instructions together, with the start address of the non-jump instruction.

If everyone only said "jumps", you'd expect that only the JCC / JMP / CALL / RET itself had to avoid touching a 32B boundary. So it's a good thing to highlight the interaction with macro-fusion.
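To make the difference concrete, here is a rough sketch of the "crosses or ends on a 32-byte boundary" check, applied once to a jcc alone and once to a macro-fused cmp+jcc pair treated as a single unit starting at the cmp. The function names, addresses, and instruction lengths are made up for illustration, and this is only a simplified model of the rule as worded in the MSVC documentation, not a description of what the hardware or any compiler actually does:

```python
BOUNDARY = 32  # bytes; the boundary size named in the mitigation text


def touches_boundary(start: int, length: int, boundary: int = BOUNDARY) -> bool:
    """True if the bytes [start, start + length) cross or end on a boundary."""
    last = start + length - 1                      # address of the final byte
    crosses = start // boundary != last // boundary
    ends_on = (start + length) % boundary == 0     # final byte is the last one in its block
    return crosses or ends_on


def fused_pair_affected(cmp_start: int, cmp_len: int, jcc_len: int) -> bool:
    """Treat a macro-fused cmp+jcc as one unit that starts at the cmp (assumed model)."""
    return touches_boundary(cmp_start, cmp_len + jcc_len)


# Hypothetical layout: a 3-byte cmp at 0x1d, immediately followed by a 2-byte jcc at 0x20.
jcc_alone = touches_boundary(0x20, 2)        # False: the jcc sits entirely inside one block
fused = fused_pair_affected(0x1d, 3, 2)      # True: the pair spans 0x1d..0x21
print(jcc_alone, fused)
```

In this layout the jcc on its own looks safe, but the fused pair does not, which is exactly the case a reader would miss if the documentation only talked about "jumps".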


This slowdown (for all jumps) is the result of a microcode mitigation / workaround for a hardware design flaw. Not being able to cache jumps that touch a 32-byte boundary in the uop cache is not the original erratum; it's a side effect of the cure.

That original erratum description doesn't say anything about affecting only conditional branches. Even if it was only conditional branches that were a real problem, perhaps the best way Intel could find to make it safe with a microcode update unfortunately affected all jumps.

For example, in Skylake-Xeon (SKX), the original erratum is documented as SKX102 in Intel's "spec update" errata list for that uarch:

SKX102. Processor May Behave Unpredictably on Complex Sequence of Conditions Which Involve Branches That Cross 64 Byte Boundaries

Problem: Under complex micro-architectural conditions involving branch instructions bytes that span multiple 64 byte boundaries (cross cache line), unpredictable system behavior may occur.

Implication: When this erratum occurs, the system may behave unpredictably.

Workaround: It is possible for BIOS to contain a workaround for this erratum. [i.e. a microcode update]

Status: No fix.


I suspect the "JCC erratum" name caught on because most branches in "hot" code are conditional. Compilers can usually avoid putting unconditional taken branches in the fast path. So it's likely that people noticed the performance problem with JCC instructions first, and that name simply stuck even though it's not accurate.

BTW, the Q&A "32-byte aligned routine does not fit the uops cache" has a screenshot of the relevant diagram from the Intel PDF you linked, plus some other links and details about performance effects.



Source: https://stackoverflow.com/questions/62305998/intel-jcc-erratum-should-jcc-really-be-treated-separately
