memory-barriers

Should a thread-safe class have a memory barrier at the end of its constructor?

◇◆丶佛笑我妖孽 Submitted on 2019-11-27 15:02:56
When implementing a class intended to be thread-safe, should I include a memory barrier at the end of its constructor, in order to ensure that any internal structures have completed being initialized before they can be accessed? Or is it the responsibility of the consumer to insert the memory barrier before making the instance available to other threads? Simplified question: is there a race hazard in the code below that could give erroneous behaviour due to the lack of a memory barrier between the initialization and the access of the thread-safe class? Or should the thread-safe class itself…

When to use lock vs MemoryBarrier in .NET

微笑、不失礼 Submitted on 2019-11-27 14:08:40
Question: In .NET the lock keyword is syntactic sugar around Monitor.Enter and Monitor.Exit, so you could say that this code lock(locker) { // Do something } is the same as Monitor.Enter(locker); try { // Do something } finally { Monitor.Exit(locker); } However, the .NET framework also includes the Thread.MemoryBarrier method, which works in a similar way: Thread.MemoryBarrier(); // Do something Thread.MemoryBarrier(); I am confused as to when I would want to use Thread.MemoryBarrier over the lock / Monitor version?

Fastest inline-assembly spinlock

橙三吉。 Submitted on 2019-11-27 07:17:01
I'm writing a multithreaded application in C++, where performance is critical. I need to use a lot of locking while copying small structures between threads; for this I have chosen to use spinlocks. I have done some research and speed testing on this, and I found that most implementations are roughly equally fast: Microsoft's CRITICAL_SECTION, with SpinCount set to 1000, scores about 140 time units; implementing this algorithm with Microsoft's InterlockedCompareExchange scores about 95 time units. I've also tried to use some inline assembly with __asm {} using something like this code, and it scores…

Does it make any sense to use the LFENCE instruction on x86/x86_64 processors?

有些话、适合烂在心里 Submitted on 2019-11-27 06:04:00
I often read on the internet that LFENCE makes no sense on x86 processors, i.e. that it does nothing, so instead of MFENCE we can painlessly use SFENCE, because MFENCE = SFENCE + LFENCE = SFENCE + NOP = SFENCE. But if LFENCE makes no sense, then why do we have four approaches to achieving sequential consistency on x86/x86_64: LOAD (without fence) and STORE + MFENCE; LOAD (without fence) and LOCK XCHG; MFENCE + LOAD and STORE (without fence); LOCK XADD(0) and STORE (without fence)? Taken from here: http://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html, as well as presentations from Herb Sutter on…

How many memory-barrier instructions does an x86 CPU have?

柔情痞子 Submitted on 2019-11-27 05:39:14
I have found out that an x86 CPU has the following memory-barrier instructions: mfence, lfence, and sfence. Does an x86 CPU have only these three memory-barrier instructions, or are there more? sfence (SSE1) and mfence / lfence (SSE2) are the only instructions that are named for their memory fence/barrier functionality. Unless you're using NT loads or stores and/or WC memory, only mfence is needed for memory ordering. (Note that lfence on Intel CPUs is also a barrier for out-of-order execution, so it can serialize rdtsc, and is useful for Spectre mitigation to prevent speculative…

When should I use _mm_sfence _mm_lfence and _mm_mfence

有些话、适合烂在心里 Submitted on 2019-11-27 04:16:42
I read the "Intel Optimization Guide for Intel Architecture". However, I still have no idea about when I should use _mm_sfence(), _mm_lfence(), or _mm_mfence(). Could anyone explain when these should be used when writing multi-threaded code? Caveat: I'm no expert in this; I'm still trying to learn it myself. But since no one has replied in the past two days, it seems experts on memory-fence instructions are not plentiful. So here's my understanding... the hardware (and the compiler) may reorder memory operations. That means your program may execute array[idx+1] = something; idx++ but the change to idx may be…

Memory model ordering and visibility?

蓝咒 Submitted on 2019-11-27 04:12:00
Question: I tried looking for details on this, I even read the standard on mutexes and atomics... but I still couldn't understand the C++11 memory model's visibility guarantees. From what I understand, the very important feature of a mutex besides mutual exclusion is ensuring visibility. I.e. it is not enough that only one thread at a time is incrementing the counter; it is important that each thread increments the counter that was stored by the thread that last held the mutex (I really don't know why people…

memory barrier and cache flush

只谈情不闲聊 Submitted on 2019-11-27 02:20:32
Question: Are there any architectures where a memory barrier is implemented with a cache flush? I read that a memory barrier affects only CPU reordering, but I have also read statements about memory barriers such as "ensures all the CPUs will see the value...", which to me implies a cache flush/invalidation. Answer 1: On pretty much all modern architectures, caches (like the L1 and L2 caches) are kept coherent by hardware. There is no need to flush any cache to make memory visible to other CPUs. One could imagine…

Globally Invisible load instructions

Deadly Submitted on 2019-11-27 02:12:11
Can some load instructions never be globally visible due to store-to-load forwarding? To put it another way, if a load instruction gets its value from the store buffer, it never has to read from the cache. Since it is generally stated that a load is globally visible when it reads from the L1D cache, the ones that do not read from the L1D should be globally invisible. The concept of global visibility for loads is tricky, because a load doesn't modify the global state of memory, and other threads can't directly observe it. But once the dust settles after out-of-order / speculative…

Variable freshness guarantee in .NET (volatile vs. volatile read)

时光总嘲笑我的痴心妄想 Submitted on 2019-11-27 02:10:15
Question: I have read much contradictory information (MSDN, SO, etc.) about volatile and VolatileRead (read-acquire fence). I understand the memory-access reordering restrictions these imply; what I'm still completely confused about is the freshness guarantee, which is very important to me. The MSDN doc for volatile mentions: "(...) This ensures that the most up-to-date value is present in the field at all times." The MSDN doc for volatile fields mentions: "A read of a volatile field is called a volatile…