memory-barriers

How to achieve a StoreLoad barrier in C++11?

Submitted by 谁说胖子不能爱 on 2020-06-08 04:52:27
Question: I want to write portable code (Intel, ARM, PowerPC...) which solves a variant of a classic problem:

    Initially: X = Y = 0
    Thread A: X = 1; if (!Y) { do something }
    Thread B: Y = 1; if (!X) { do something }

in which the goal is to avoid a situation in which both threads are doing something. (It's fine if neither runs; this isn't a run-exactly-once mechanism.) Please correct me if you see any flaws in my reasoning below. I am aware that I can achieve the goal with memory_order_seq_cst atomic stores

C11 Standalone memory barriers LoadLoad StoreStore LoadStore StoreLoad

Submitted by 左心房为你撑大大i on 2020-05-15 08:23:24
Question: I want to use standalone memory barriers between atomic and non-atomic operations (I think it shouldn't matter at all anyway). I think I understand what a store barrier and a load barrier mean, and also the four types of possible memory reorderings: LoadLoad, StoreStore, LoadStore, StoreLoad. However, I always find the acquire/release concepts confusing, because when reading the documentation, acquire doesn't only speak about loads, but also stores, and release doesn't only speak about stores

Why is LOCK a full barrier on x86?

Submitted by 倖福魔咒の on 2020-05-15 03:45:27
Question: Why does the LOCK prefix cause a full barrier on x86 (and thus drain the store buffer and give sequential consistency)? For LOCKed read-modify-write operations, a full barrier shouldn't be required, and exclusive access to the cache line seems to be sufficient. Is it a design choice, or is there some other limitation?

Answer 1: Long ago, before the Intel 80486, Intel processors didn't have on-chip caches or write buffers. Therefore, by design, all writes become immediately globally visible in

Why does using MFENCE with store instruction block prefetching in L1 cache?

Submitted by 扶醉桌前 on 2020-03-18 04:46:11
Question: I have an object 64 bytes in size:

    typedef struct _object {
        int value;
        char pad[60];
    } object;

In main I initialize an array of object:

    volatile object *array;
    int arr_size = 1000000;
    array = (object *) malloc(arr_size * sizeof(object));
    for (int i = 0; i < arr_size; i++) {
        array[i].value = 1;
        _mm_clflush(&array[i]);
    }
    _mm_mfence();

Then I loop again through each element. This is the loop I am counting events for:

    int tmp;
    for (int i = 0; i < arr_size - 105; i++) {
        array[i].value = 2;
        //tmp = array[i

C11 Atomic Acquire/Release and x86_64 lack of load/store coherence?

Submitted by 时光怂恿深爱的人放手 on 2020-03-17 10:58:59
Question: I am struggling with Section 5.1.2.4 of the C11 Standard, in particular the semantics of Release/Acquire. I note that https://preshing.com/20120913/acquire-and-release-semantics/ (amongst others) states that:

    ... Release semantics prevent memory reordering of the write-release with any read or write operation that precedes it in program order.

So, for the following:

    typedef struct test_struct {
        _Atomic(bool) ready;
        int v1;
        int v2;
    } test_struct_t;

    extern void test_init(test_struct_t* ts,

C# SemaphoreSlim array elements read/write synchronization [closed]

Submitted by 醉酒当歌 on 2020-03-04 20:05:42
Question (closed: needs to be more focused, not currently accepting answers): My blocking ring queue doesn't use lock or Mutex, only two SemaphoreSlim instances (one blocks at 0 and the other at the maximum element count, so the write and read parts of the array never intersect) and two int indexes modified by Interlocked.Decrement (they don't determine the write/read index directly, but keep it unique and moving correctly).

How to guarantee that load completes before store occurs?

Submitted by 此生再无相见时 on 2020-03-03 07:03:19
Question: In the following code, how can one ensure that ptr is not incremented until after *ptr has been loaded/assigned/"extracted"?

    extern int arr[some_constexpr]; // assume pre-populated
    extern int *ptr;                // assume points into non-atomic arr

    int a = *ptr;
    // want "memory barrier/fence" here
    ++ptr;

Would an atomic pointer ensure the correct ordering/sequencing?

    #include <atomic>
    extern int arr[some_constexpr];
    extern std::atomic<int*> ptr;

    int a = *(ptr.load()); // implicit "memory barrier" achieved