memory-barriers

Is there any compiler barrier which is equivalent to asm("" ::: "memory") in C++11?

Submitted by 旧时模样 on 2019-11-26 21:29:17
Question: My test code is below, and I found that only memory_order_seq_cst forbade the compiler's reordering.

    #include <atomic>
    using namespace std;

    int A, B = 1;

    void func(void) {
        A = B + 1;
        atomic_thread_fence(memory_order_seq_cst);
        B = 0;
    }

Other choices such as memory_order_release and memory_order_acq_rel did not generate any compiler barrier at all. I think they must work with an atomic variable, as below.

    #include <atomic>
    using namespace std;

    atomic<int> A(0);
    int B = 1;

    void func(void) { A
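A minimal sketch of the likely answer to the question above: std::atomic_signal_fence constrains only compiler reordering and emits no fence instruction, which makes it the standard C++11 counterpart of the asm("" ::: "memory") idiom (variable names here are illustrative, not from the original snippet):

```cpp
#include <atomic>

int A_var, B_var = 1;

// Like asm("" ::: "memory"): atomic_signal_fence forbids the compiler
// from moving memory accesses across it, but generates no hardware
// fence instruction, unlike atomic_thread_fence.
int compiler_barrier_demo() {
    A_var = B_var + 1;
    std::atomic_signal_fence(std::memory_order_seq_cst); // compiler barrier only
    B_var = 0;
    return A_var + B_var; // 2 + 0
}
```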

Does an x86 CPU reorder instructions?

Submitted by 故事扮演 on 2019-11-26 21:00:40
Question: I have read that some CPUs reorder instructions, and that this is not a problem for single-threaded programs (the instructions are still reordered, but the program behaves as if they had executed in order); it is only a problem for multithreaded programs. To solve the problem of instruction reordering, we can insert memory barriers at the appropriate places in the code. But does an x86 CPU reorder instructions? If it does not, then there is no need to use
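It does: the one reordering x86 exposes to programs is StoreLoad (a later load may complete before an earlier store to a different address becomes visible). A hedged sketch of how C++ code pays for the barrier that suppresses it — on current mainstream compilers a seq_cst store is typically compiled to XCHG or MOV+MFENCE:

```cpp
#include <atomic>

std::atomic<int> X{0}, Y{0};

// x86 preserves load-load, store-store, and load-store order, but may
// reorder this store past the following load. A seq_cst store carries
// the full barrier that forbids that.
int store_then_load() {
    X.store(1, std::memory_order_seq_cst); // full barrier on x86
    return Y.load(std::memory_order_seq_cst);
}
```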

Does std::mutex create a fence?

Submitted by 别来无恙 on 2019-11-26 20:48:25
Question: If I lock a std::mutex, will I always get a memory fence? I am unsure whether locking merely implies a fence or actually enforces one. Update: Found this reference while following up on RMF's comments: Multithreaded programming and memory visibility

Answer 1: Unlocking a mutex synchronizes with locking the mutex. I don't know what options the compiler has for the implementation, but you get the same effect as a fence.

Answer 2: As I understand it, this is covered in 1.10 Multi-threaded executions and data races, Para 5: The
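A short sketch of what "unlocking synchronizes with locking" buys you in practice: lock() is an acquire operation and unlock() is a release operation, so everything written under the lock is visible to the next thread that takes it — the same visibility a fence pair gives, scoped to the mutex (the functions and values here are illustrative):

```cpp
#include <mutex>

std::mutex m;
int shared_data = 0;
bool ready = false;

// Writes made while holding the mutex...
void producer() {
    std::lock_guard<std::mutex> g(m);
    shared_data = 42;
    ready = true;
} // ...are released when the lock_guard unlocks here.

// ...and are guaranteed visible to whoever acquires the lock next.
int consumer() {
    std::lock_guard<std::mutex> g(m);
    return ready ? shared_data : -1;
}
```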

Does a memory barrier ensure that cache coherence has completed?

Submitted by 邮差的信 on 2019-11-26 19:25:31
Question: Say I have two threads that manipulate the global variable x. Each thread (or each core, I suppose) will have a cached copy of x. Now say that Thread A executes the following instructions:

    set x to 5
    some other instruction

Now when "set x to 5" is executed, the cached value of x will be set to 5; this will cause the cache coherence protocol to act and update the caches of the other cores with the new value of x. Now my question is: when x is actually set to 5 in Thread A's cache, do the
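Worth noting alongside the question above: a barrier does not "wait for coherence to finish propagating"; it orders this core's own accesses. What actually guarantees that another thread observes x = 5 is pairing a release store with an acquire load, as in this hedged sketch (names are illustrative):

```cpp
#include <atomic>

std::atomic<int> x{0};
std::atomic<bool> flag{false};

// Publish x = 5: the release store on flag orders the write to x
// before it, so any thread that sees flag == true also sees x == 5.
void writer() {
    x.store(5, std::memory_order_relaxed);
    flag.store(true, std::memory_order_release);
}

// Spin until the flag is published, then read x with the guarantee
// established by the acquire/release pairing.
int reader() {
    while (!flag.load(std::memory_order_acquire)) {}
    return x.load(std::memory_order_relaxed);
}
```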

Analyzing the x86 output generated by the JIT in the context of volatile

Submitted by 柔情痞子 on 2019-11-26 18:39:08
Question: I am writing this post in connection with Deep understanding of volatile in Java.

    public class Main {
        private int x;
        private volatile int g;

        public void actor1() {
            x = 1;
            g = 1;
        }

        public void actor2() {
            put_on_screen_without_sync(g);
            put_on_screen_without_sync(x);
        }
    }

Now, I am analyzing what the JIT generated for the above piece of code. From our discussion in my previous post we know that the output 1, 0 is impossible because: a write to volatile v causes that every action a preceding v causes that a will
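A C++ analogue of the Java snippet may clarify why the 1, 0 output is impossible (this is an assumption-laden translation, not the JIT's actual output): a Java volatile write/read corresponds roughly to a seq_cst atomic store/load, so the plain write to x happens-before any read that has already observed g == 1:

```cpp
#include <atomic>

int x_plain = 0;          // like Java's plain field x
std::atomic<int> g{0};    // like Java's volatile field g

void actor1() {
    x_plain = 1;
    g.store(1, std::memory_order_seq_cst); // analogue of a volatile write
}

// Once the volatile-like read observes g == 1, x_plain must be 1;
// returns -1 if the flag has not been observed yet.
int actor2() {
    if (g.load(std::memory_order_seq_cst) == 1) // analogue of a volatile read
        return x_plain;
    return -1;
}
```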

Should thread-safe class have a memory barrier at the end of its constructor?

Submitted by 谁说我不能喝 on 2019-11-26 16:59:15
Question: When implementing a class intended to be thread-safe, should I include a memory barrier at the end of its constructor, in order to ensure that any internal structures have finished being initialized before they can be accessed? Or is it the responsibility of the consumer to insert the memory barrier before making the instance available to other threads? Simplified question: Is there a race hazard in the code below that could give erroneous behaviour due to the lack of a memory barrier
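A hypothetical C++ sketch of the usual answer: the ordering belongs at the publication site, not inside the constructor. The publishing store is made a release and the consuming load an acquire, so the object is fully constructed before its pointer becomes visible (all names here are illustrative):

```cpp
#include <atomic>

struct Widget {
    int value;
    Widget() : value(7) {}  // no barrier needed inside the constructor
};

std::atomic<Widget*> g_widget{nullptr};

// Safe publication: the release store orders the constructor's writes
// before the pointer becomes visible to other threads.
void publish() {
    g_widget.store(new Widget(), std::memory_order_release);
}

// The acquire load pairs with the release store, so a non-null pointer
// always refers to a fully initialized Widget.
int consume() {
    Widget* w = g_widget.load(std::memory_order_acquire);
    return w ? w->value : -1;
}
```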

Why do we need Thread.MemoryBarrier()?

Submitted by 丶灬走出姿态 on 2019-11-26 16:10:51
In "C# 4 in a Nutshell", the author shows that this class can sometimes write 0 without MemoryBarrier, though I can't reproduce it on my Core 2 Duo:

    public class Foo {
        int _answer;
        bool _complete;

        public void A() {
            _answer = 123;
            //Thread.MemoryBarrier(); // Barrier 1
            _complete = true;
            //Thread.MemoryBarrier(); // Barrier 2
        }

        public void B() {
            //Thread.MemoryBarrier(); // Barrier 3
            if (_complete) {
                //Thread.MemoryBarrier(); // Barrier 4
                Console.WriteLine(_answer);
            }
        }
    }

    private static void ThreadInverteOrdemComandos() {
        Foo obj = new Foo();
        Task.Factory.StartNew(obj.A);
        Task.Factory.StartNew(obj.B
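A C++ rendering of the book's Foo class (an analogue, not the original C#): making _complete a release/acquire atomic plays the role of barriers 2 and 3, which are the pair that prevents B from printing 0 after seeing the flag:

```cpp
#include <atomic>

int answer = 0;
std::atomic<bool> complete{false};

// Barrier 2's job: the release store keeps answer = 123 ordered
// before the flag becomes visible.
void A() {
    answer = 123;
    complete.store(true, std::memory_order_release);
}

// Barrier 3's job: the acquire load ensures that once the flag is
// observed, answer is already 123; returns -1 if the flag is unset.
int B() {
    if (complete.load(std::memory_order_acquire))
        return answer;
    return -1;
}
```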

Does it make any sense to use the LFENCE instruction on x86/x86_64 processors?

Submitted by 大兔子大兔子 on 2019-11-26 11:50:58
Question: Often on the internet I find claims that LFENCE makes no sense on x86 processors, i.e. that it does nothing, so instead of MFENCE we can absolutely painlessly use SFENCE, because MFENCE = SFENCE + LFENCE = SFENCE + NOP = SFENCE. But if LFENCE makes no sense, then why do we have four approaches to achieving sequential consistency on x86/x86_64?

    1. LOAD (without fence) and STORE + MFENCE
    2. LOAD (without fence) and LOCK XCHG
    3. MFENCE + LOAD and STORE (without fence)
    4. LOCK XADD(0) and STORE (without fence)

Taken from
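The four mappings above are different placements of the one barrier x86 needs for sequential consistency (StoreLoad). As a hedged illustration, mainstream C++ compilers usually pick mapping 2: plain MOV for seq_cst loads and XCHG (or MOV+MFENCE) for seq_cst stores, so the store carries the full barrier and LFENCE never appears in this role:

```cpp
#include <atomic>

std::atomic<int> v{0};

// Typical x86 codegen (compiler-dependent, stated as an assumption):
// the seq_cst store becomes XCHG, a full barrier...
void sc_store(int n) { v.store(n, std::memory_order_seq_cst); }

// ...while the seq_cst load is a plain MOV, no fence needed.
int sc_load() { return v.load(std::memory_order_seq_cst); }
```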

Memory barrier generators

Submitted by 别说谁变了你拦得住时间么 on 2019-11-26 11:43:07
Reading Joseph Albahari's threading tutorial, the following are mentioned as generators of memory barriers:

    C#'s lock statement (Monitor.Enter / Monitor.Exit)
    All methods on the Interlocked class
    Asynchronous callbacks that use the thread pool — these include asynchronous delegates, APM callbacks, and Task continuations
    Setting and waiting on a signaling construct
    Anything that relies on signaling, such as starting or waiting on a Task

In addition, Hans Passant and Brian Gideon added the following (assuming none of them already fits into one of the previous categories): Starting or waking
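The same pattern of implicit barrier generators exists in C++; as one sketch, starting and joining a thread both synchronize without any explicit fence — thread creation synchronizes-with the start of the new thread's body, and the thread's completion synchronizes-with the return of join():

```cpp
#include <thread>

int data = 0;

// No explicit fence anywhere: creation publishes data = 1 into the
// thread, and join() publishes the thread's update back to the caller.
int start_and_join() {
    data = 1;                           // visible inside the new thread
    std::thread t([] { data += 41; });  // creation synchronizes-with the body
    t.join();                           // completion synchronizes-with join()
    return data;
}
```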

Globally Invisible load instructions

Submitted by 我的梦境 on 2019-11-26 10:00:15
Question: Can some load instructions never be globally visible, due to store-to-load forwarding? To put it another way, if a load instruction gets its value from the store buffer, it never has to read from the cache. Since it is generally stated that a load is globally visible when it reads from the L1D cache, the ones that do not read from the L1D should be globally invisible.

Answer 1: The concept of global visibility for loads is tricky, because a load doesn't modify the global state of memory, and
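A minimal sketch of the scenario in the question (the observable result is the same either way; which microarchitectural path services the load cannot be detected from C++): within one thread, a load that follows a store to the same address can be satisfied from the store buffer before that store is globally visible:

```cpp
#include <atomic>

std::atomic<int> loc{0};

// The reload may be serviced by store-to-load forwarding from the
// store buffer rather than from the L1D cache; a thread always sees
// its own latest store regardless.
int store_then_reload() {
    loc.store(31, std::memory_order_relaxed);
    return loc.load(std::memory_order_relaxed);
}
```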