memory-barriers

For purposes of ordering, is atomic read-modify-write one operation or two?

痴心易碎 submitted on 2021-01-22 06:18:01
Question: Consider an atomic read-modify-write operation such as x.exchange(..., std::memory_order_acq_rel). For purposes of ordering with respect to loads and stores to other objects, is this treated as: a single operation with acquire-release semantics? Or as an acquire load followed by a release store, with the added guarantee that other loads and stores to x will observe both of them or neither? If it's #2, then although no other operations in the same thread could be reordered before the load or …
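A minimal sketch (my own example, not from the question) of the practical consequence either reading gives: an acq_rel exchange that reads the value written by another acq_rel exchange synchronizes-with it, so ordinary data written before the first RMW is visible after the second.

    #include <atomic>
    #include <cassert>
    #include <thread>

    std::atomic<int> x{0};
    int data = 0;                     // ordinary, non-atomic payload

    void producer() {
        data = 42;                                    // plain store
        x.exchange(1, std::memory_order_acq_rel);     // release half: the store to data cannot sink below this
    }

    void consumer() {
        // acquire half: spin until this RMW reads the value the producer wrote
        while (x.exchange(2, std::memory_order_acq_rel) != 1) {}
        assert(data == 42);           // guaranteed: the two exchanges synchronize-with each other
    }

    int main() {
        std::thread t1(producer), t2(consumer);
        t1.join();
        t2.join();
    }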

C++: How is release-and-acquire achieved on x86 using only MOV?

帅比萌擦擦* submitted on 2021-01-20 03:48:29
Question: This question is a follow-up/clarification to this: Does the MOV x86 instruction implement a C++11 memory_order_release atomic store? That question states that the MOV assembly instruction is sufficient to perform acquire-release semantics on x86; we do not need LOCK, fences, xchg, etc. However, I am struggling to understand how this works. Intel doc Vol. 3A Chapter 8 states: https://software.intel.com/sites/default/files/managed/7c/f1/253668-sdm-vol-3a.pdf In a single-processor (core) system.... Reads …
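For concreteness, a small sketch (assuming x86-64 and a mainstream compiler) of why no extra instruction is needed: x86 already forbids LoadLoad, LoadStore and StoreStore reordering, which is exactly what release stores and acquire loads require, so each of them compiles to a plain MOV.

    #include <atomic>

    std::atomic<int> flag{0};
    int payload = 0;

    void writer() {
        payload = 42;                                   // ordinary store
        flag.store(1, std::memory_order_release);       // x86: mov dword ptr [flag], 1
    }

    int reader() {
        while (flag.load(std::memory_order_acquire) == 0) {}  // x86: mov reg, [flag] in a loop
        return payload;                                 // guaranteed to see 42
    }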

Can a mutex replace memory barriers?

℡╲_俬逩灬. submitted on 2021-01-08 15:27:57
Question: I was trying to understand memory barriers and came across the Wikipedia article below: http://en.wikipedia.org/wiki/Memory_barrier It explains the concept well, but I wondered whether barriers are really needed in a system where we have mutex() locking the memory section. Taking the same code as in the Wikipedia article, will the approach below solve the problem using a mutex? [Note: function names are not specific to any programming language; they are just used for simplicity's sake.] Processor #1: mutex_lock(a) while (f …
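A sketch of how the mutex version could look in C++ (names are hypothetical, loosely following the Wikipedia example): because locking is an acquire and unlocking is a release, the flag and the data are ordered by the lock itself and no explicit barrier is needed. The important detail is that the waiting side must re-acquire the lock on each iteration rather than spin while holding it.

    #include <mutex>

    std::mutex m;
    bool f = false;   // the flag from the example
    int  x = 0;       // the shared data

    // Processor #1: waits for the flag, then reads the data
    int processor1() {
        for (;;) {
            std::lock_guard<std::mutex> lock(m);   // lock = acquire
            if (f) return x;                       // once f is seen true, x == 42 is guaranteed
        }                                          // unlock = release, each iteration
    }

    // Processor #2: publishes the data, then the flag
    void processor2() {
        std::lock_guard<std::mutex> lock(m);
        x = 42;
        f = true;
    }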

How does Intel x86 implement total order over stores?

微笑、不失礼 submitted on 2021-01-05 09:16:06
Question: x86 guarantees a total order over all stores due to its TSO memory model. My question is whether anyone has an idea how this is actually implemented. I have a good impression of how all four fences are implemented, so I can explain how local order is preserved. But the four fences just give PO; they won't give you TSO (I know TSO allows older stores to jump in front of newer loads, so only three of the four fences are needed). Total order over all memory actions on a single address is the responsibility of …
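The usual explanation (stated here as the common understanding, not taken from the question) is that each core's store buffer drains to the coherent L1d cache in program order, and the coherence protocol serializes ownership of each line, which yields a single global order of stores. What TSO does and does not allow can be shown with the classic store-buffering litmus test, assuming relaxed atomics compile to plain loads and stores on x86:

    #include <atomic>

    std::atomic<int> x{0}, y{0};
    int r1, r2;

    // Under TSO the outcome r1 == 0 && r2 == 0 is allowed: each store can sit in
    // its core's store buffer while the later load runs (StoreLoad reordering).
    // All other reorderings, and any disagreement between observers about the
    // order of two independent stores, are forbidden.
    void thread0() {
        x.store(1, std::memory_order_relaxed);
        r1 = y.load(std::memory_order_relaxed);
    }

    void thread1() {
        y.store(1, std::memory_order_relaxed);
        r2 = x.load(std::memory_order_relaxed);
    }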

Java, volatile and memory barriers on x86 architecture

狂风中的少年 submitted on 2021-01-02 07:18:52
Question: This is more of a theoretical question. I'm not sure whether all the concepts, compiler behaviors, etc. are up to date and still in use, but I'd like confirmation that I'm correctly understanding some concepts I'm trying to learn. The language is Java. From what I've understood so far, on the x86 architecture, StoreLoad barriers (whatever the exact CPU instructions used to implement them) are placed after volatile writes, to make them visible to subsequent volatile reads in other threads (since x86 doesn't …
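A rough C++ analogue (my assumption: a Java volatile access behaves approximately like a seq_cst atomic access) of where that StoreLoad barrier ends up on x86: the write side pays for it, either as mov plus mfence or as a single locked instruction, while the read side stays a plain MOV.

    #include <atomic>

    std::atomic<int> v{0};

    void volatile_like_write() {
        v.store(1, std::memory_order_seq_cst);    // x86: xchg [v], reg  (or mov + mfence)
    }

    int volatile_like_read() {
        return v.load(std::memory_order_seq_cst); // x86: mov reg, [v]   (no extra fence)
    }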

Does memory fencing block threads in multi-core CPUs?

蓝咒 submitted on 2020-12-29 13:54:34
Question: I was reading the Intel 64 and IA-32 instruction set guide to get an idea of memory fences. My question is: taking SFENCE as an example, in order to make sure that all store operations are globally visible, does the multi-core CPU park all the threads, even those running on other cores, until cache coherence is achieved? Answer 1: Barriers don't make other threads/cores wait. They make some operations in the current thread wait, depending on what kind of barrier it is. Out-of-order execution of …
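To illustrate the answer's point, a small sketch using x86 intrinsics (my own example; the question shows no code): SFENCE only orders this core's own stores. Its typical use is draining weakly-ordered non-temporal stores before publishing a flag, and other cores keep running the whole time.

    #include <immintrin.h>
    #include <atomic>

    int buf[4];
    std::atomic<int> ready{0};

    void producer() {
        _mm_stream_si32(&buf[0], 42);              // non-temporal (weakly ordered) store
        _mm_sfence();                               // order this core's earlier stores before its later ones
        ready.store(1, std::memory_order_release);  // publish; no other core is paused by the fence
    }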