memory-barriers

Is there an implicit memory barrier with synchronized-with relationship on thread::join?

♀尐吖头ヾ submitted on 2019-12-02 00:53:11
I have code at work that starts multiple threads doing some operations, and if any of them fails it sets a shared variable to false. Then the main thread joins all the worker threads. A simulation of this looks roughly like this (I commented out the possible fix, which I don't know whether it is needed):

    #include <thread>
    #include <atomic>
    #include <vector>
    #include <iostream>
    #include <cassert>
    using namespace std;

    //atomic_bool success = true;
    bool success = true;

    int main() {
        vector<thread> v;
        for (int i = 0; i < 10; ++i) {
            v.emplace_back([=] {
                if (i == 5 || i == 6) {
                    //success.store(false,
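For reference, a minimal self-contained sketch of the pattern described above (the excerpt's code is cut off), using the commented-out atomic variant; the memory orders are my own illustrative choice, not from the question:

    // Sketch only: workers flag failure, main joins, then reads the flag.
    #include <thread>
    #include <atomic>
    #include <vector>
    #include <cassert>

    std::atomic<bool> success{true};

    int main() {
        std::vector<std::thread> v;
        for (int i = 0; i < 10; ++i) {
            v.emplace_back([i] {
                if (i == 5 || i == 6)
                    success.store(false, std::memory_order_relaxed);
            });
        }
        for (auto& t : v)
            t.join();  // a thread's completion synchronizes-with the return of its join()
        assert(!success.load(std::memory_order_relaxed));
    }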

Can DMB instructions be safely omitted in ARM Cortex M4

这一生的挚爱 submitted on 2019-12-01 12:38:24
I am going through the assembly generated by GCC for an ARM Cortex-M4 and noticed that atomic_compare_exchange_weak gets two DMB instructions inserted around the condition (compiled with GCC 4.9 using -std=gnu11 -O2):

    // if (atomic_compare_exchange_weak(&address, &x, y))
    dmb     sy
    ldrex   r0, [r3]
    cmp     r0, r2
    itt     eq
    strexeq lr, r1, [r3]
    cmpeq.w lr, #0
    dmb     sy
    bne.n   ...

Since the programming guide to barrier instructions for the ARM Cortex-M4 states that: "Omitting the DMB or DSB instruction in the examples in Figure 41 and Figure 42 would not cause any error because the Cortex-M processors: do not re
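For context, a sketch of the sort of source that produces such a sequence (my illustration in C++ rather than the question's C11; names are assumed): with the default memory_order_seq_cst, GCC brackets the LDREX/STREX loop with DMB, and whether weaker orderings drop the barriers depends on the compiler version and target.

    // Illustrative compare-exchange in the same spirit as the excerpt's call.
    #include <atomic>

    std::atomic<int> address{0};

    bool try_swap(int expected_value, int desired) {
        int expected = expected_value;
        // Default ordering is memory_order_seq_cst; this is what typically
        // makes the compiler emit barriers around the LDREX/STREX sequence.
        return address.compare_exchange_weak(expected, desired);
    }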

asio implicit strand and data synchronization

半城伤御伤魂 submitted on 2019-12-01 11:05:57
When I read the asio source code, I was curious about how asio keeps data synchronized between threads even when only an implicit strand is in effect. This is the code in asio:

io_service::run

    mutex::scoped_lock lock(mutex_);
    std::size_t n = 0;
    for (; do_run_one(lock, this_thread, ec); lock.lock())
      if (n != (std::numeric_limits<std::size_t>::max)())
        ++n;
    return n;

io_service::do_run_one

    while (!stopped_)
    {
      if (!op_queue_.empty())
      {
        // Prepare to execute first handler from queue.
        operation* o = op_queue_.front();
        op_queue_.pop();
        bool more_handlers = (!op_queue_.empty());
        if (o == &task_operation_)
        {
          task
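Independent of asio's internals, the hand-off mechanism being asked about can be sketched in isolation (assumed names, not asio code): unlocking a mutex in the thread that queues a handler synchronizes-with locking it in the thread that pops the handler, so the handler's data is visible when it runs.

    // Minimal mutex-protected queue hand-off between threads.
    #include <functional>
    #include <mutex>
    #include <optional>
    #include <queue>

    std::mutex m;
    std::queue<std::function<void()>> op_queue;

    void post(std::function<void()> handler) {
        std::lock_guard<std::mutex> lock(m);   // unlock releases: publishes the handler
        op_queue.push(std::move(handler));
    }

    std::optional<std::function<void()>> pop_one() {
        std::lock_guard<std::mutex> lock(m);   // lock acquires: sees the published handler
        if (op_queue.empty())
            return std::nullopt;
        auto h = std::move(op_queue.front());
        op_queue.pop();
        return h;
    }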

Thread safe usage of lock helpers (concerning memory barriers)

允我心安 submitted on 2019-12-01 05:39:57
By lock helpers I am referring to disposable objects with which locking can be implemented via using statements. For example, consider a typical usage of the SyncLock class from Jon Skeet's MiscUtil:

    public class Example
    {
        private readonly SyncLock _padlock;

        public Example()
        {
            _padlock = new SyncLock();
        }

        public void ConcurrentMethod()
        {
            using (_padlock.Lock())
            {
                // Now own the padlock - do concurrent stuff
            }
        }
    }

Now, consider the following usage:

    var example = new Example();
    new Thread(example.ConcurrentMethod).Start();

My question is this - since example is created on one thread and
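The excerpt is C#, but the visibility question it sets up (an object constructed on one thread, its lock used from another) has a close C++ analogue, sketched below with std::mutex and std::thread; this is my analogue, not MiscUtil code, and the answer for C# depends on the .NET memory model rather than on this sketch.

    // C++ analogue: the std::thread constructor synchronizes-with the start of
    // the new thread, so the fully constructed object (including its mutex
    // member) is visible inside ConcurrentMethod.
    #include <mutex>
    #include <thread>

    class Example {
    public:
        void ConcurrentMethod() {
            std::lock_guard<std::mutex> guard(padlock_);  // RAII stand-in for using (_padlock.Lock())
            // Now own the padlock - do concurrent stuff
        }
    private:
        std::mutex padlock_;
    };

    int main() {
        Example example;                                   // constructed on the main thread
        std::thread t(&Example::ConcurrentMethod, &example);
        t.join();
    }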

How is load->store reordering possible with in-order commit?

孤街醉人 submitted on 2019-12-01 05:19:22
ARM allows reordering loads with subsequent stores, so that the following pseudocode:

    // CPU 0          | // CPU 1
    temp0 = x;        | temp1 = y;
    y = 1;            | x = 1;

can result in temp0 == temp1 == 1 (and this is observable in practice as well). I'm having trouble understanding how this occurs; it seems like in-order commit would prevent it (and in-order commit, it was my understanding, is present in pretty much all OOO processors). My reasoning goes: "the load must have its value before it commits, it commits before the store, and the store's value can't become visible to other processors until it commits." I'm guessing
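A C++ rendering of that litmus test (my sketch, reusing the pseudocode's names) uses relaxed atomics, which permit exactly the temp0 == temp1 == 1 outcome being asked about:

    // With relaxed ordering, both threads observing 1 is a permitted outcome,
    // i.e. each load may effectively be reordered after the subsequent store.
    #include <atomic>
    #include <thread>

    std::atomic<int> x{0}, y{0};
    int temp0 = 0, temp1 = 0;

    int main() {
        std::thread cpu0([] {
            temp0 = x.load(std::memory_order_relaxed);
            y.store(1, std::memory_order_relaxed);
        });
        std::thread cpu1([] {
            temp1 = y.load(std::memory_order_relaxed);
            x.store(1, std::memory_order_relaxed);
        });
        cpu0.join();
        cpu1.join();
        // temp0 == 1 && temp1 == 1 is allowed (and observable on ARM).
    }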

Does `xchg` encompass `mfence` assuming no non-temporal instructions?

喜欢而已 submitted on 2019-12-01 04:28:58
I have already seen this answer and this answer, but neither appears to be clear and explicit about the equivalence or non-equivalence of mfence and xchg under the assumption of no non-temporal instructions. The Intel instruction reference for xchg mentions that this instruction is "useful for implementing semaphores or similar data structures for process synchronization", and further references Chapter 8 of Volume 3A. That reference states the following: "For the P6 family processors, locked operations serialize all outstanding load and store operations (that is, wait for them to complete)." This
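For concreteness, a small example (my illustration, not from the question) of where the distinction shows up in codegen: a sequentially consistent store, which mainstream x86-64 compilers lower either to mov followed by mfence or to a single xchg.

    // GCC and Clang commonly compile the seq_cst store below to `xchg` on
    // x86-64 (older GCC used `mov` + `mfence`); the question is whether those
    // lowerings are equivalent when no non-temporal stores are involved.
    #include <atomic>

    std::atomic<int> flag{0};

    void publish() {
        flag.store(1, std::memory_order_seq_cst);  // needs a full barrier on x86
    }

    int consume() {
        return flag.load(std::memory_order_seq_cst);  // a plain mov suffices on x86
    }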
