Acquire/Release versus Sequentially Consistent memory order

半阙折子戏 2020-12-07 12:42

For any std::atomic<T> where T is a primitive type:

If I use std::memory_order_acq_rel for fetch_xxx operations, and

2 Answers
  •  离开以前
    2020-12-07 13:06

    The C++11 memory ordering parameters for atomic operations specify constraints on the ordering. If one thread does a store with std::memory_order_release, and a load in another thread reads the stored value with std::memory_order_acquire, then subsequent reads in the second thread will see every value the first thread wrote to any memory location before the store-release, or a later value written to one of those locations.
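
The release/acquire pairing described above is often used to publish ordinary data through an atomic flag. The following is a minimal sketch of that pattern, not code from the original answer; the names `payload`, `ready`, and `run_message_passing` are hypothetical and chosen only for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Runs one producer and one consumer and returns what the consumer read.
// `payload` is ordinary (non-atomic) data; `ready` is the flag that publishes it.
int run_message_passing() {
    int payload = 0;
    std::atomic<bool> ready{false};
    int seen = -1;

    std::thread producer([&] {
        payload = 42;                                  // plain write before the store-release
        ready.store(true, std::memory_order_release);  // publishes the write above
    });

    std::thread consumer([&] {
        // Spin until the load-acquire reads the value written by the store-release.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        seen = payload;  // ordered after the acquire, so this reads 42
    });

    producer.join();
    consumer.join();
    return seen;
}
```

Calling `run_message_passing()` from `main` returns 42 on every run. With std::memory_order_relaxed on both the store and the load, the read of `payload` would instead be a data race.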

    If both the store and subsequent load are std::memory_order_seq_cst then the relationship between these two threads is the same. You need more threads to see the difference.

    For example, consider std::atomic<int> variables x and y, both initially 0.

    Thread 1:

    x.store(1,std::memory_order_release);
    

    Thread 2:

    y.store(1,std::memory_order_release);
    

    Thread 3:

    int a=x.load(std::memory_order_acquire); // x before y
    int b=y.load(std::memory_order_acquire); 
    

    Thread 4:

    int c=y.load(std::memory_order_acquire); // y before x
    int d=x.load(std::memory_order_acquire);
    

    As written, there is no relationship between the stores to x and y, so it is quite possible to see a==1, b==0 in thread 3, and c==1 and d==0 in thread 4. In other words, the two reading threads may disagree about which store happened first.

    If all the memory orderings are changed to std::memory_order_seq_cst then this enforces an ordering between the stores to x and y. Consequently, if thread 3 sees a==1 and b==0 then that means the store to x must be before the store to y, so if thread 4 sees c==1, meaning the store to y has completed, then the store to x must also have completed, so we must have d==1.
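
The four-thread example above can be turned into a small repeatable experiment. This is a sketch under stated assumptions: the helper names `Observed`, `run_once`, `readers_disagree`, and `count_disagreements` are hypothetical, and every operation uses std::memory_order_seq_cst, so the outcome a==1, b==0, c==1, d==0 should never be counted.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Values read by thread 3 (a, b) and thread 4 (c, d).
struct Observed { int a, b, c, d; };

// One run of the four-thread example with seq_cst on every operation.
Observed run_once() {
    std::atomic<int> x{0}, y{0};
    Observed r{};

    std::thread t1([&] { x.store(1, std::memory_order_seq_cst); });
    std::thread t2([&] { y.store(1, std::memory_order_seq_cst); });
    std::thread t3([&] {
        r.a = x.load(std::memory_order_seq_cst);  // x before y
        r.b = y.load(std::memory_order_seq_cst);
    });
    std::thread t4([&] {
        r.c = y.load(std::memory_order_seq_cst);  // y before x
        r.d = x.load(std::memory_order_seq_cst);
    });

    t1.join(); t2.join(); t3.join(); t4.join();
    return r;
}

// True when the two readers saw the stores in opposite orders.
bool readers_disagree(const Observed& r) {
    return r.a == 1 && r.b == 0 && r.c == 1 && r.d == 0;
}

// Repeats the experiment and counts how often the readers disagree.
int count_disagreements(int runs) {
    int count = 0;
    for (int i = 0; i < runs; ++i) {
        if (readers_disagree(run_once())) ++count;
    }
    return count;
}
```

Switching the stores to std::memory_order_release and the loads to std::memory_order_acquire makes the disagreeing outcome permitted by the standard. Whether it is ever actually observed depends on the hardware, so a count of zero in that variant proves nothing.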

    In practice, using std::memory_order_seq_cst everywhere adds overhead to loads, stores, or both, depending on your compiler and processor architecture. For example, a common technique on x86 processors is to use XCHG instructions rather than MOV instructions for std::memory_order_seq_cst stores in order to provide the necessary ordering guarantees, whereas a plain MOV suffices for std::memory_order_release. On systems with more relaxed memory architectures the overhead may be greater, since plain loads and stores have fewer guarantees.
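
One way to see the difference in generated code is to compile a few one-line store functions and compare the assembly. This is only a sketch: `flag` and the `publish_*` functions are hypothetical names, and the instructions mentioned in the comments are what the text above describes as common on x86, not something the standard guarantees.

```cpp
#include <atomic>
#include <cassert>

std::atomic<int> flag{0};  // hypothetical variable, for illustration only

// Release store: on x86 a plain MOV is typically sufficient.
void publish_release(int v) {
    flag.store(v, std::memory_order_release);
}

// Sequentially consistent store: x86 compilers commonly emit XCHG here.
void publish_seq_cst(int v) {
    flag.store(v, std::memory_order_seq_cst);
}

// The default memory order is seq_cst, so this behaves like publish_seq_cst.
void publish_default(int v) {
    flag.store(v);
}
```

Compiling this with optimizations enabled and inspecting the assembly output (for example with `-O2 -S` on GCC or Clang) shows which instruction your toolchain picks for each function.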

    Memory ordering is hard. I devoted almost an entire chapter to it in my book.
