memory-barriers

Is there any compiler barrier which is equivalent to asm("" ::: "memory") in C++11?

Submitted by 旧时模样 on 2019-11-26 21:29:17
Question: My test code is below, and I found that only memory_order_seq_cst forbade the compiler's reordering.

    #include <atomic>
    using namespace std;

    int A, B = 1;

    void func(void) {
        A = B + 1;
        atomic_thread_fence(memory_order_seq_cst);
        B = 0;
    }

Other choices such as memory_order_release and memory_order_acq_rel did not generate any compiler barrier at all. I think they must work with an atomic variable, as below.

    #include <atomic>
    using namespace std;

    atomic<int> A(0);
    int B = 1;

    void func(void) { A
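A minimal sketch of the likely answer to the question above: std::atomic_signal_fence constrains only compiler reordering and emits no fence instruction, which makes it the standard C++11 counterpart of the asm("" ::: "memory") idiom (variable names here are illustrative, not from the original snippet):

```cpp
#include <atomic>

int A_var, B_var = 1;

// Like asm("" ::: "memory"): atomic_signal_fence forbids the compiler
// from moving memory accesses across it, but generates no hardware
// fence instruction, unlike atomic_thread_fence.
int compiler_barrier_demo() {
    A_var = B_var + 1;
    std::atomic_signal_fence(std::memory_order_seq_cst); // compiler barrier only
    B_var = 0;
    return A_var + B_var; // 2 + 0
}
```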

Does an x86 CPU reorder instructions?

Submitted by 故事扮演 on 2019-11-26 21:00:40
Question: I have read that some CPUs reorder instructions, and that this is not a problem for single-threaded programs (the instructions are still reordered, but the program behaves as if they had executed in order); it is only a problem for multithreaded programs. To solve the problem of instruction reordering, we can insert memory barriers at the appropriate places in the code. But does an x86 CPU reorder instructions? If it does not, then there is no need to use
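It does: the one reordering x86 exposes to programs is StoreLoad (a later load may complete before an earlier store to a different address becomes visible). A hedged sketch of how C++ code pays for the barrier that suppresses it — on current mainstream compilers a seq_cst store is typically compiled to XCHG or MOV+MFENCE:

```cpp
#include <atomic>

std::atomic<int> X{0}, Y{0};

// x86 preserves load-load, store-store, and load-store order, but may
// reorder this store past the following load. A seq_cst store carries
// the full barrier that forbids that.
int store_then_load() {
    X.store(1, std::memory_order_seq_cst); // full barrier on x86
    return Y.load(std::memory_order_seq_cst);
}
```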

Does std::mutex create a fence?

Submitted by 别来无恙 on 2019-11-26 20:48:25
Question: If I lock a std::mutex, will I always get a memory fence? I am unsure whether locking merely implies a fence or actually enforces one. Update: Found this reference while following up on RMF's comments: Multithreaded programming and memory visibility

Answer 1: Unlocking a mutex synchronizes with locking the mutex. I don't know what options the compiler has for the implementation, but you get the same effect as a fence.

Answer 2: As I understand it, this is covered in 1.10 Multi-threaded executions and data races, Para 5: The
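A short sketch of what "unlocking synchronizes with locking" buys you in practice: lock() is an acquire operation and unlock() is a release operation, so everything written under the lock is visible to the next thread that takes it — the same visibility a fence pair gives, scoped to the mutex (the functions and values here are illustrative):

```cpp
#include <mutex>

std::mutex m;
int shared_data = 0;
bool ready = false;

// Writes made while holding the mutex...
void producer() {
    std::lock_guard<std::mutex> g(m);
    shared_data = 42;
    ready = true;
} // ...are released when the lock_guard unlocks here.

// ...and are guaranteed visible to whoever acquires the lock next.
int consumer() {
    std::lock_guard<std::mutex> g(m);
    return ready ? shared_data : -1;
}
```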

Does a memory barrier ensure that cache coherence has completed?

Submitted by 邮差的信 on 2019-11-26 19:25:31
Question: Say I have two threads that manipulate the global variable x. Each thread (or each core, I suppose) will have a cached copy of x. Now say that Thread A executes the following instructions:

    set x to 5
    some other instruction

Now when "set x to 5" is executed, the cached value of x will be set to 5; this will cause the cache coherence protocol to act and update the caches of the other cores with the new value of x. Now my question is: when x is actually set to 5 in Thread A's cache, do the
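Worth noting alongside the question above: a barrier does not "wait for coherence to finish propagating"; it orders this core's own accesses. What actually guarantees that another thread observes x = 5 is pairing a release store with an acquire load, as in this hedged sketch (names are illustrative):

```cpp
#include <atomic>

std::atomic<int> x{0};
std::atomic<bool> flag{false};

// Publish x = 5: the release store on flag orders the write to x
// before it, so any thread that sees flag == true also sees x == 5.
void writer() {
    x.store(5, std::memory_order_relaxed);
    flag.store(true, std::memory_order_release);
}

// Spin until the flag is published, then read x with the guarantee
// established by the acquire/release pairing.
int reader() {
    while (!flag.load(std::memory_order_acquire)) {}
    return x.load(std::memory_order_relaxed);
}
```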

Analyzing the x86 output generated by the JIT in the context of volatile

Submitted by 柔情痞子 on 2019-11-26 18:39:08
Question: I am writing this post in connection with Deep understanding of volatile in Java.

    public class Main {
        private int x;
        private volatile int g;

        public void actor1() {
            x = 1;
            g = 1;
        }

        public void actor2() {
            put_on_screen_without_sync(g);
            put_on_screen_without_sync(x);
        }
    }

Now, I am analyzing what the JIT generated for the above piece of code. From our discussion in my previous post we know that the output 1, 0 is impossible because: a write to volatile v causes that every action a preceding v causes that a will
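A C++ analogue of the Java snippet may clarify why the 1, 0 output is impossible (this is an assumption-laden translation, not the JIT's actual output): a Java volatile write/read corresponds roughly to a seq_cst atomic store/load, so the plain write to x happens-before any read that has already observed g == 1:

```cpp
#include <atomic>

int x_plain = 0;          // like Java's plain field x
std::atomic<int> g{0};    // like Java's volatile field g

void actor1() {
    x_plain = 1;
    g.store(1, std::memory_order_seq_cst); // analogue of a volatile write
}

// Once the volatile-like read observes g == 1, x_plain must be 1;
// returns -1 if the flag has not been observed yet.
int actor2() {
    if (g.load(std::memory_order_seq_cst) == 1) // analogue of a volatile read
        return x_plain;
    return -1;
}
```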

Should thread-safe class have a memory barrier at the end of its constructor?

Submitted by 谁说我不能喝 on 2019-11-26 16:59:15
Question: When implementing a class intended to be thread-safe, should I include a memory barrier at the end of its constructor, in order to ensure that any internal structures have finished being initialized before they can be accessed? Or is it the responsibility of the consumer to insert the memory barrier before making the instance available to other threads? Simplified question: Is there a race hazard in the code below that could give erroneous behaviour due to the lack of a memory barrier
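A hypothetical C++ sketch of the usual answer: the ordering belongs at the publication site, not inside the constructor. The publishing store is made a release and the consuming load an acquire, so the object is fully constructed before its pointer becomes visible (all names here are illustrative):

```cpp
#include <atomic>

struct Widget {
    int value;
    Widget() : value(7) {}  // no barrier needed inside the constructor
};

std::atomic<Widget*> g_widget{nullptr};

// Safe publication: the release store orders the constructor's writes
// before the pointer becomes visible to other threads.
void publish() {
    g_widget.store(new Widget(), std::memory_order_release);
}

// The acquire load pairs with the release store, so a non-null pointer
// always refers to a fully initialized Widget.
int consume() {
    Widget* w = g_widget.load(std::memory_order_acquire);
    return w ? w->value : -1;
}
```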

Why do we need Thread.MemoryBarrier()?

Submitted by 丶灬走出姿态 on 2019-11-26 16:10:51
In "C# 4 in a Nutshell", the author shows that this class can sometimes write 0 without MemoryBarrier, though I can't reproduce it on my Core 2 Duo:

    public class Foo {
        int _answer;
        bool _complete;

        public void A() {
            _answer = 123;
            //Thread.MemoryBarrier(); // Barrier 1
            _complete = true;
            //Thread.MemoryBarrier(); // Barrier 2
        }

        public void B() {
            //Thread.MemoryBarrier(); // Barrier 3
            if (_complete) {
                //Thread.MemoryBarrier(); // Barrier 4
                Console.WriteLine(_answer);
            }
        }
    }

    private static void ThreadInverteOrdemComandos() {
        Foo obj = new Foo();
        Task.Factory.StartNew(obj.A);
        Task.Factory.StartNew(obj.B
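A C++ rendering of the book's Foo class (an analogue, not the original C#): making _complete a release/acquire atomic plays the role of barriers 2 and 3, which are the pair that prevents B from printing 0 after seeing the flag:

```cpp
#include <atomic>

int answer = 0;
std::atomic<bool> complete{false};

// Barrier 2's job: the release store keeps answer = 123 ordered
// before the flag becomes visible.
void A() {
    answer = 123;
    complete.store(true, std::memory_order_release);
}

// Barrier 3's job: the acquire load ensures that once the flag is
// observed, answer is already 123; returns -1 if the flag is unset.
int B() {
    if (complete.load(std::memory_order_acquire))
        return answer;
    return -1;
}
```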

Does it make any sense to use the LFENCE instruction on x86/x86_64 processors?

Submitted by 大兔子大兔子 on 2019-11-26 11:50:58
Question: Often on the internet I find claims that LFENCE makes no sense on x86 processors, i.e. that it does nothing, so instead of MFENCE we can absolutely painlessly use SFENCE, because MFENCE = SFENCE + LFENCE = SFENCE + NOP = SFENCE. But if LFENCE makes no sense, then why do we have four approaches to achieving sequential consistency on x86/x86_64?

    1. LOAD (without fence) and STORE + MFENCE
    2. LOAD (without fence) and LOCK XCHG
    3. MFENCE + LOAD and STORE (without fence)
    4. LOCK XADD(0) and STORE (without fence)

Taken from
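The four mappings above are different placements of the one barrier x86 needs for sequential consistency (StoreLoad). As a hedged illustration, mainstream C++ compilers usually pick mapping 2: plain MOV for seq_cst loads and XCHG (or MOV+MFENCE) for seq_cst stores, so the store carries the full barrier and LFENCE never appears in this role:

```cpp
#include <atomic>

std::atomic<int> v{0};

// Typical x86 codegen (compiler-dependent, stated as an assumption):
// the seq_cst store becomes XCHG, a full barrier...
void sc_store(int n) { v.store(n, std::memory_order_seq_cst); }

// ...while the seq_cst load is a plain MOV, no fence needed.
int sc_load() { return v.load(std::memory_order_seq_cst); }
```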

Memory barrier generators

Submitted by 别说谁变了你拦得住时间么 on 2019-11-26 11:43:07
Reading Joseph Albahari's threading tutorial, the following are mentioned as generators of memory barriers:

    C#'s lock statement (Monitor.Enter / Monitor.Exit)
    All methods on the Interlocked class
    Asynchronous callbacks that use the thread pool — these include asynchronous delegates, APM callbacks, and Task continuations
    Setting and waiting on a signaling construct
    Anything that relies on signaling, such as starting or waiting on a Task

In addition, Hans Passant and Brian Gideon added the following (assuming none of them already fits into one of the previous categories): Starting or waking
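The same pattern of implicit barrier generators exists in C++; as one sketch, starting and joining a thread both synchronize without any explicit fence — thread creation synchronizes-with the start of the new thread's body, and the thread's completion synchronizes-with the return of join():

```cpp
#include <thread>

int data = 0;

// No explicit fence anywhere: creation publishes data = 1 into the
// thread, and join() publishes the thread's update back to the caller.
int start_and_join() {
    data = 1;                           // visible inside the new thread
    std::thread t([] { data += 41; });  // creation synchronizes-with the body
    t.join();                           // completion synchronizes-with join()
    return data;
}
```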

Globally Invisible load instructions

Submitted by 我的梦境 on 2019-11-26 10:00:15
Question: Can some load instructions never be globally visible, due to store-to-load forwarding? To put it another way, if a load instruction gets its value from the store buffer, it never has to read from the cache. Since it is generally stated that a load is globally visible when it reads from the L1D cache, the ones that do not read from the L1D should be globally invisible.

Answer 1: The concept of global visibility for loads is tricky, because a load doesn't modify the global state of memory, and
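A minimal sketch of the scenario in the question (the observable result is the same either way; which microarchitectural path services the load cannot be detected from C++): within one thread, a load that follows a store to the same address can be satisfied from the store buffer before that store is globally visible:

```cpp
#include <atomic>

std::atomic<int> loc{0};

// The reload may be serviced by store-to-load forwarding from the
// store buffer rather than from the L1D cache; a thread always sees
// its own latest store regardless.
int store_then_reload() {
    loc.store(31, std::memory_order_relaxed);
    return loc.load(std::memory_order_relaxed);
}
```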