memory-barriers

What is the (slight) difference in the relaxed atomic rules?

守給你的承諾、 submitted on 2020-08-25 10:30:06
Question: After seeing Herb Sutter's excellent talk about "atomic weapons" I got a bit confused about the relaxed atomics examples. What I took away was that an atomic in the C++ memory model (SC-DRF = Sequentially Consistent for Data-Race-Free) does an "acquire" on a load/read. I understand that for a load [and a store] the default is std::memory_order_seq_cst, and therefore the two are the same: myatomic.load(); // (1) myatomic.load(std::memory_order_seq_cst); // (2) So far so good, no relaxed atomics …
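
A minimal sketch of the point in the excerpt (the variable name myatomic is taken from the question; the function names are illustrative): the default load is seq_cst, which is at least as strong as an acquire load, while a relaxed load keeps only atomicity.

```cpp
#include <atomic>

std::atomic<int> myatomic{0};

int read_default() { return myatomic.load(); }                          // (1) defaults to seq_cst
int read_seq_cst() { return myatomic.load(std::memory_order_seq_cst); } // (2) identical to (1)
int read_relaxed() { return myatomic.load(std::memory_order_relaxed); } // atomic, but no acquire/ordering guarantee
```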

Can an atomic release be “overwritten”?

最后都变了- submitted on 2020-08-05 07:31:31
Question: Say I have atomic<int> i; Thread A performs an atomic store/exchange with memory_order_release. Next, thread B performs an atomic store with memory_order_release. Thread C performs an atomic fetch_add(0, memory_order_acquire). Does thread C acquire dependencies from threads A and B, or only from thread B? Answer 1: Only B (I'm going to assume that by "next" you mean the modification order of the atomic is A -> B -> C, so that by [atomics.order]p11 C's RMW must read the value B wrote). See the note in …
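
A sketch of the scenario as the answer reads it (thread and payload names are illustrative; it assumes the modification order of i happens to be A's store, then B's store, and that C's fetch_add reads B's value): C then synchronizes-with B only, because B's plain store is not part of A's release sequence.

```cpp
#include <atomic>
#include <thread>

std::atomic<int> i{0};
int payload_a = 0, payload_b = 0;

void thread_a() { payload_a = 1; i.store(1, std::memory_order_release); }
void thread_b() { payload_b = 1; i.store(2, std::memory_order_release); }

void thread_c() {
    // Acquire RMW: if it reads B's value (2), C synchronizes-with B.
    if (i.fetch_add(0, std::memory_order_acquire) == 2) {
        int b = payload_b; // guaranteed to be 1: B's write happens-before this read
        (void)b;
        // payload_a carries no such guarantee here: there is no
        // synchronizes-with edge from A to C in this reading.
    }
}

int main() {
    std::thread a(thread_a), b(thread_b), c(thread_c);
    a.join(); b.join(); c.join();
}
```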

Why does this `std::atomic_thread_fence` work

て烟熏妆下的殇ゞ submitted on 2020-07-09 05:21:20
Question: First I want to list some of my understandings regarding this; please correct me if I'm wrong. An MFENCE on x86 can ensure a full barrier. Sequential consistency prevents reordering of STORE-STORE, STORE-LOAD, LOAD-STORE and LOAD-LOAD; this is according to Wikipedia. std::memory_order_seq_cst makes no guarantee to prevent STORE-LOAD reordering; this is according to Alex's answer, "Loads May Be Reordered with Earlier Stores to Different Locations" (for x86), and mfence will not always be added.
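
A minimal sketch of the store-buffering (Dekker-style) pattern these points are about (names are illustrative, not from the question): with a seq_cst fence between each relaxed store and the following relaxed load, the outcome r0 == 0 && r1 == 0 is forbidden; dropping the fences allows the StoreLoad reordering that can produce it.

```cpp
#include <atomic>
#include <thread>

std::atomic<int> x{0}, y{0};
int r0 = 0, r1 = 0;

void t0() {
    x.store(1, std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_seq_cst); // full barrier (mfence or a locked op on x86)
    r0 = y.load(std::memory_order_relaxed);
}

void t1() {
    y.store(1, std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_seq_cst);
    r1 = x.load(std::memory_order_relaxed);
}

int main() {
    std::thread a(t0), b(t1);
    a.join(); b.join();
    // With the fences, r0 == 0 && r1 == 0 cannot occur.
}
```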

How is std::atomic<T>::notify_all ordered?

空扰寡人 submitted on 2020-07-06 13:55:48
Question: I expect the program below not to hang. If (2) and (3) are observed in reverse order by (1), it may hang due to a lost notification: #include <atomic> #include <chrono> #include <thread> int main() { std::atomic<bool> go{ false }; std::thread thd([&go] { go.wait(false, std::memory_order_relaxed); // (1) }); std::this_thread::sleep_for(std::chrono::milliseconds(400)); go.store(true, std::memory_order_relaxed); // (2) go.notify_all(); // (3) thd.join(); return 0; } So the question is what would …
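
For readability, here is the program from the excerpt laid out as a compilable unit (unchanged apart from whitespace; std::atomic::wait and notify_all require C++20):

```cpp
#include <atomic>
#include <chrono>
#include <thread>

int main() {
    std::atomic<bool> go{ false };
    std::thread thd([&go] {
        go.wait(false, std::memory_order_relaxed);   // (1) block until go != false
    });
    std::this_thread::sleep_for(std::chrono::milliseconds(400));
    go.store(true, std::memory_order_relaxed);       // (2) publish the new value
    go.notify_all();                                 // (3) wake the waiter
    thd.join();
    return 0;
}
```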

An implementation of std::atomic_thread_fence(std::memory_order_seq_cst) on x86 without extra performance penalties

二次信任 submitted on 2020-06-16 19:09:40
Question: A follow-up question to "Why does this `std::atomic_thread_fence` work". Since a dummy interlocked operation is better than _mm_mfence, and there are quite a few ways to implement it, which interlocked operation, and on what data, should it be used? Assume inline assembly that is not aware of the surrounding context but can tell the compiler which registers it clobbers. Answer 1: Short answer for now, without going into too much detail about why. See specifically the discussion in comments on that …
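
A hedged sketch of the dummy-interlocked-operation idea, not a definitive implementation: any locked read-modify-write is a full barrier on x86, so a `lock or` of 0 into a dword just below the stack pointer can stand in for mfence. The helper name and the choice of operand/offset are assumptions here; the "memory" clobber is what tells the compiler not to reorder memory accesses across the asm.

```cpp
#include <atomic>

inline void full_fence_x86() {
#if defined(__GNUC__) && defined(__x86_64__)
    // Dummy interlocked operation: atomically OR 0 into a dword on the stack.
    // The data is left unchanged, but the locked RMW acts as a full
    // (StoreLoad) barrier, typically cheaper than mfence.
    asm volatile("lock orl $0, -4(%%rsp)" ::: "memory", "cc");
#else
    // Portable fallback when the inline-asm path is unavailable.
    std::atomic_thread_fence(std::memory_order_seq_cst);
#endif
}
```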
