Alignment requirements for atomic x86 instructions vs. MS's InterlockedCompareExchange documentation?

前端 未结 4 1974
灰色年华
灰色年华 2020-11-28 07:06

Microsoft offers the InterlockedCompareExchange function for performing atomic compare-and-swap operations. There is also an _InterlockedCompareExchange intrinsic.<

4条回答
  •  情书的邮戳
    2020-11-28 07:48

    The PDF you are quoting from is from 1999 and CLEARLY outdated.

    The up-to-date Intel documentation, specifically Volume-3A tells a different story.

    For example, on a Core-i7 processor, you STILL have to make sure your data doesn't not span over cache-lines, or else the operation is NOT guaranteed to be atomic.

    On Volume 3A, System Programming, For x86/x64 Intel clearly states:

    8.1.1 Guaranteed Atomic Operations

    The Intel486 processor (and newer processors since) guarantees that the following basic memory operations will always be carried out atomically:

    • Reading or writing a byte
    • Reading or writing a word aligned on a 16-bit boundary
    • Reading or writing a doubleword aligned on a 32-bit boundary

    The Pentium processor (and newer processors since) guarantees that the following additional memory operations will always be carried out atomically:

    • Reading or writing a quadword aligned on a 64-bit boundary
    • 16-bit accesses to uncached memory locations that fit within a 32-bit data bus

    The P6 family processors (and newer processors since) guarantee that the following additional memory operation will always be carried out atomically:

    • Unaligned 16-, 32-, and 64-bit accesses to cached memory that fit within a cache line

    Accesses to cacheable memory that are split across cache lines and page boundaries are not guaranteed to be atomic by the Intel Core 2 Duo, Intel® Atom™, Intel Core Duo, Pentium M, Pentium 4, Intel Xeon, P6 family, Pentium, and Intel486 processors. The Intel Core 2 Duo, Intel Atom, Intel Core Duo, Pentium M, Pentium 4, Intel Xeon, and P6 family processors provide bus control signals that permit external memory subsystems to make split accesses atomic; however, nonaligned data accesses will seriously impact the performance of the processor and should be avoided

提交回复
热议问题