stdatomic

memory_order_relaxed and visibility

Submitted 2021-02-15 07:36:51
Question: Consider two threads, T1 and T2, that store and load an atomic integer a_i respectively. Let's further assume that the store has executed before the load starts executing; by "before", I mean in the absolute sense of time.

    T1                                      T2
    // other instructions here...           // other instructions here
    // ...                                  // ...
    a_i.store(7, memory_order_relaxed)      a_i.load(memory_order_relaxed)
    // other instructions here              // other instructions here

Is it guaranteed that T2 sees the value 7 after the load?

Answer 1: Is it …
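As a minimal sketch of the scenario (the sleep standing in for "the store happened earlier in absolute time" is purely illustrative), the point usually made in answer is that memory_order_relaxed guarantees atomicity and eventual visibility but no synchronization, so the standard alone does not promise that this particular load observes the 7, even though on real hardware it almost always will:

    #include <atomic>
    #include <chrono>
    #include <cstdio>
    #include <thread>

    std::atomic<int> a_i{0};

    int main() {
        std::thread t1([] {
            a_i.store(7, std::memory_order_relaxed);      // T1: relaxed store
        });
        std::thread t2([] {
            // Crude stand-in for "the load starts after the store has finished".
            std::this_thread::sleep_for(std::chrono::milliseconds(100));
            int v = a_i.load(std::memory_order_relaxed);  // T2: relaxed load
            std::printf("T2 read %d\n", v);               // 7 in practice; the standard alone
                                                          // does not synchronize the two threads
        });
        t1.join();
        t2.join();
    }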

Are these allowed optimizations in C++? [duplicate]

Submitted 2021-02-11 05:09:12
Question: (This question already has answers here: "Why don't compilers merge redundant std::atomic writes?" (9 answers); "Can atomic loads be merged in the C++ memory model?" (2 answers). Closed 9 months ago.) Let std::atomic<std::int64_t> num{0}; be defined somewhere accessible/visible in the code. Is the C++ compiler allowed to replace each of the following two snippets with empty code (something that does nothing)? Similarly, are these optimizations allowed to happen at runtime? I am just trying to get a …
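The two snippets the question refers to are truncated above, so the following is only a hypothetical illustration (function names are made up) of the kind of code the linked answers analyse: operations whose results are unused or whose effects cancel out, which the as-if rule arguably allows a compiler to collapse, even though mainstream compilers currently choose not to:

    #include <atomic>
    #include <cstdint>

    std::atomic<std::int64_t> num{0};

    void discarded_load() {
        (void)num.load(std::memory_order_relaxed);     // value never used
    }

    void cancelling_rmw() {
        num.fetch_add(1, std::memory_order_relaxed);   // +1 ...
        num.fetch_sub(1, std::memory_order_relaxed);   // ... immediately undone
    }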

Are memory orderings: consume, acq_rel and seq_cst ever needed on Intel x86?

Submitted 2021-02-08 14:36:43
Question: C++11 specifies six memory orderings:

    typedef enum memory_order {
        memory_order_relaxed,
        memory_order_consume,
        memory_order_acquire,
        memory_order_release,
        memory_order_acq_rel,
        memory_order_seq_cst
    } memory_order;

https://en.cppreference.com/w/cpp/atomic/memory_order

where the default is seq_cst. Performance gains can be had by relaxing the memory ordering of operations; however, this depends on what protections the architecture provides. For example, Intel x86 has a strong memory model and …
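As a hedged illustration of why this question matters in practice, here is how the different orderings typically map to x86 code; the instruction comments reflect common compiler output, not a guarantee:

    #include <atomic>

    std::atomic<int> x{0};

    void stores() {
        x.store(1, std::memory_order_relaxed);  // x86: plain mov
        x.store(2, std::memory_order_release);  // x86: plain mov (stores already ordered with stores)
        x.store(3, std::memory_order_seq_cst);  // x86: typically xchg, or mov + mfence
    }

    int loads() {
        int a = x.load(std::memory_order_relaxed);  // x86: plain mov
        int b = x.load(std::memory_order_acquire);  // x86: plain mov
        int c = x.load(std::memory_order_seq_cst);  // x86: plain mov (the store side pays the fence)
        return a + b + c;
    }

Even where the hardware instruction is identical, the weaker orderings still matter for the compiler, which may reorder and combine more aggressively around relaxed operations.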

Does this envelope implementation correctly use C++11 atomics?

Submitted 2021-02-07 13:15:54
Question: I have written a simple 'envelope' class to make sure I understand the C++11 atomic semantics correctly. I have a header and a payload, where the writer clears the header, fills in the payload, then fills the header with an increasing integer. The idea is that a reader can then read the header, memcpy out the payload, read the header again, and if the header is the same the reader can assume they successfully copied the payload. It's OK that the reader may miss some updates, but it's not …
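The poster's class is not reproduced above; below is a minimal sketch of the described scheme with illustrative names (Envelope, Payload). The comments flag the spot answers to this kind of question usually focus on: the non-atomic memcpy of the payload while a writer may be mid-update is formally a data race in the C++11 model, even if the header check makes it appear benign in practice.

    #include <atomic>
    #include <cstring>

    struct Payload { char bytes[64]; };

    struct Envelope {
        std::atomic<unsigned> header{0};
        Payload payload{};

        void write(const Payload& p, unsigned seq) {        // single writer assumed, seq > 0
            header.store(0, std::memory_order_release);      // clear header: "update in progress"
            std::memcpy(&payload, &p, sizeof payload);       // fill payload
            header.store(seq, std::memory_order_release);    // publish increasing sequence number
        }

        bool try_read(Payload& out) {
            unsigned h1 = header.load(std::memory_order_acquire);
            if (h1 == 0) return false;                       // writer in progress
            std::memcpy(&out, &payload, sizeof out);         // formally racy if writer is active
            std::atomic_thread_fence(std::memory_order_acquire);
            unsigned h2 = header.load(std::memory_order_relaxed);
            return h1 == h2;                                 // unchanged header => copy assumed good
        }
    };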

std::atomic<bool> lock-free inconsistency on ARM (raspberry pi 3)

Submitted 2021-01-27 07:01:34
Question: I had a problem with a static assert. The static assert was exactly this:

    static_assert(std::atomic<bool>::is_always_lock_free);

and the code failed on a Raspberry Pi 3 (Linux raspberrypi 4.19.118-v7+ #1311 SMP Mon Apr 27 14:21:24 BST 2020 armv7l GNU/Linux). The cppreference.com page for atomic::is_always_lock_free states: "Equals true if this atomic type is always lock-free and false if it is never or sometimes lock-free. The value of this constant is consistent with …"
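A small probe, assuming C++17 for is_always_lock_free, makes the quoted distinction visible: the static constant answers "is every object of this type lock-free?", while is_lock_free() answers the question per object at run time, which is how a "sometimes lock-free" result can arise on a platform such as armv7:

    #include <atomic>
    #include <cstdio>

    int main() {
        std::atomic<bool> b{false};
        // Compile-time: true only if every object of std::atomic<bool> is lock-free.
        std::printf("is_always_lock_free: %d\n",
                    (int)std::atomic<bool>::is_always_lock_free);
        // Run-time: property of this particular object; it may still report true
        // even when the compile-time constant is false ("sometimes lock-free").
        std::printf("is_lock_free():      %d\n", (int)b.is_lock_free());
    }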

For purposes of ordering, is atomic read-modify-write one operation or two?

Submitted 2021-01-22 06:18:01
Question: Consider an atomic read-modify-write operation such as x.exchange(..., std::memory_order_acq_rel). For purposes of ordering with respect to loads and stores to other objects, is this treated as:

1. a single operation with acquire-release semantics? Or,
2. an acquire load followed by a release store, with the added guarantee that other loads and stores to x will observe both of them or neither?

If it's #2, then although no other operations in the same thread could be reordered before the load or …
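A hedged sketch of what the two readings would permit (y and z are illustrative extra variables): under reading #1 nothing crosses the exchange at all, while under reading #2 the neighbouring operations could at most slide into the window between the load half and the store half; neither may move before the acquire load nor after the release store.

    #include <atomic>

    std::atomic<int> x{0};
    std::atomic<int> y{0}, z{0};

    int rmw_ordering_example() {
        y.store(1, std::memory_order_relaxed);              // before the RMW in program order
        int old = x.exchange(2, std::memory_order_acq_rel); // the operation in question
        int r = z.load(std::memory_order_relaxed);          // after the RMW in program order
        // Reading #1: y's store and z's load stay on their own sides of the exchange.
        // Reading #2: both could be reordered into the window between the "acquire load"
        // and the "release store" halves, but not past the far side of either half.
        return old + r;
    }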

C++ How is release-and-acquire achieved on x86 only using MOV?

Submitted 2021-01-20 03:48:29
Question: This question is a follow-up/clarification of this one: "Does the MOV x86 instruction implement a C++11 memory_order_release atomic store?" That question states that the MOV assembly instruction is sufficient to perform acquire-release semantics on x86; we do not need LOCK, fences, xchg, etc. However, I am struggling to understand how this works. The Intel SDM, Vol. 3A, Chapter 8 (https://software.intel.com/sites/default/files/managed/7c/f1/253668-sdm-vol-3a.pdf), states: "In a single-processor (core) system.... Reads …"
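To make the claim concrete before digging into the SDM wording, here is a hedged sketch of a release/acquire handoff; the x86 instructions in the comments reflect typical compiler output (plain MOVs, no LOCK prefix or fence), not mandated code generation:

    #include <atomic>

    std::atomic<int>  data{0};
    std::atomic<bool> ready{false};

    void producer() {
        data.store(42, std::memory_order_relaxed);
        ready.store(true, std::memory_order_release);   // x86: mov byte ptr [ready], 1
    }

    int consumer() {
        while (!ready.load(std::memory_order_acquire))  // x86: movzx eax, byte ptr [ready]
            ;                                           // spin until the flag is published
        return data.load(std::memory_order_relaxed);    // guaranteed to observe 42
    }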

Is the transformation of fetch_add(0, memory_order_relaxed/release) to mfence + mov legal?

Submitted 2020-12-30 06:32:27
Question: The paper N4455, "No Sane Compiler Would Optimize Atomics", discusses various optimizations compilers can apply to atomics. Under the section "Optimization Around Atomics", for the seqlock example, it mentions a transformation implemented in LLVM where a fetch_add(0, std::memory_order_release) is turned into an mfence followed by a plain load, rather than the usual lock add or xadd. The idea is that this avoids taking exclusive ownership of the cache line and is relatively cheaper. The mfence is …
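For orientation, here is a hedged sketch of the seqlock-reader pattern the paper is talking about (names are illustrative, not the paper's exact code, and the non-atomic payload copy carries the usual seqlock caveats): the second check of the sequence counter is written as fetch_add(0, ...), and the question is whether lowering that RMW to an mfence plus an ordinary mov load is a legal implementation.

    #include <atomic>
    #include <cstring>

    std::atomic<unsigned> seq{0};
    int payload[8];

    bool try_read(int out[8]) {
        unsigned s1 = seq.load(std::memory_order_acquire);
        if (s1 & 1) return false;                        // odd => writer in progress
        std::memcpy(out, payload, sizeof payload);       // copy the protected data
        // Re-check the counter. Written as an RMW of 0 so it has release-class
        // ordering; the paper notes LLVM may emit "mfence; mov" here instead of
        // "lock xadd", avoiding exclusive ownership of the cache line.
        unsigned s2 = seq.fetch_add(0, std::memory_order_release);
        return s1 == s2;                                 // unchanged => snapshot is consistent
    }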