memory-fences

Is there an implicit memory barrier with synchronized-with relationship on thread::join?

Submitted by 纵饮孤独 on 2019-12-01 21:56:15
Question: I have code at work that starts multiple threads that do some operations, and if any of them fail they set a shared variable to false. The main thread then joins all the worker threads. A simulation of this looks roughly like the following (I commented out the possible fix, which I don't know is needed):

#include <thread>
#include <atomic>
#include <vector>
#include <iostream>
#include <cassert>

using namespace std;

//atomic_bool success = true;
bool success = true;

int main() {
    vector<thread> v;

clarifications on full memory barriers involved by pthread mutexes

Submitted by 夙愿已清 on 2019-12-01 21:09:34
I have heard that when dealing with mutexes, the necessary memory barriers are handled by the pthread API itself. I would like more details on this matter. Are these claims true, at least on the most common architectures around? Does the compiler recognize this implicit barrier and avoid reordering operations / reading from local registers when generating the code? When is the memory barrier applied: after successfully acquiring a mutex AND after releasing it? The POSIX specification lists the functions that must "synchronize memory with respect to other threads", which includes

Do locked instructions provide a barrier between weakly-ordered accesses?

Submitted by 梦想的初衷 on 2019-12-01 19:52:11
Question: On x86, lock-prefixed instructions such as lock cmpxchg provide barrier semantics in addition to their atomic operation: for normal memory accesses on write-back memory regions, reads and writes are not reordered across lock-prefixed instructions, per section 8.2.2 of Volume 3 of the Intel SDM:

    Reads or writes cannot be reordered with I/O instructions, locked instructions, or serializing instructions.

This section applies only to write-back memory types. In the same list, you find an

C++11 When To Use A Memory Fence?

Submitted by 百般思念 on 2019-12-01 13:39:36
I'm writing some threaded C++11 code, and I'm not totally sure when I need to use a memory fence or something. So here is basically what I'm doing:

class Worker {
    std::string arg1;
    int arg2;
    int arg3;
    std::thread thread;
public:
    Worker( std::string arg1, int arg2, int arg3 ) {
        this->arg1 = arg1;
        this->arg2 = arg2;
        this->arg3 = arg3;
    }
    void DoWork() {
        this->thread = std::thread( &Worker::Work, this );
    }
private:
    void Work() {  // return type added; missing in the original snippet
        // Do stuff with args
    }
};  // semicolon added; missing in the original snippet

int main() {
    Worker worker( "some data", 1, 2 );
    worker.DoWork();
    // Wait for it to finish
    return 0;
}

I was wondering, what steps do I need to

Is a memory barrier an instruction that the CPU executes, or is it just a marker?

Submitted by 点点圈 on 2019-11-30 11:13:48
I am trying to understand what a memory barrier is, exactly. Based on what I know so far, a memory barrier (for example, mfence) is used to prevent the reordering of instructions from before to after and from after to before the memory barrier. This is an example of a memory barrier in use:

instruction 1
instruction 2
instruction 3
mfence
instruction 4
instruction 5
instruction 6

Now my question is: Is the mfence instruction just a marker telling the CPU in what order to execute the instructions? Or is it an instruction that the CPU actually executes, like it executes other instructions (for

Do we need mfence when using xchg

Submitted by 你说的曾经没有我的故事 on 2019-11-30 04:29:16
Question: I have a test-and-set xchg-based assembly lock. My question is: do we need to use memory fencing (mfence, sfence or lfence) when using the xchg instruction?

Edit: 64-bit platform, with Intel Nehalem.

Answer 1: As said in the other answers, the lock prefix is implicit here, so there is no problem at the assembler level. The problem may lie at the C (or C++) level when you use that as inline assembly. Here you have to ensure that the compiler doesn't reorder instructions with respect to your xchg

Intel 64 and IA-32 | Atomic operations including acquire / release semantic

Submitted by 廉价感情. on 2019-11-30 02:29:22
According to the Intel 64 and IA-32 Architectures Software Developer's Manual, the LOCK signal prefix "ensures that the processor has exclusive use of any shared memory while the signal is asserted". That can be in the form of a bus or cache lock. But - and that's the reason I'm asking this question - it isn't clear to me whether this prefix also provides any memory barrier. I'm developing with NASM in a multi-processor environment and need to implement atomic operations with optional acquire and/or release semantics. So, do I need to use the MFENCE, SFENCE and LFENCE instructions, or would this

Memory Fences - Need help to understand

Submitted by 允我心安 on 2019-11-29 23:18:54
I'm reading Memory Barriers by Paul E. McKenney http://www.rdrop.com/users/paulmck/scalability/paper/whymb.2010.07.23a.pdf Everything is explained in great detail, and just when I think everything is clear, I encounter one sentence which undermines everything and makes me think that I understood nothing. Let me show the example:

void foo(void)
{
    a = 1;  /* #1 */
    b = 1;  /* #2 */
}

void bar(void)
{
    while (b == 0) continue;  /* #3 */
    assert(a == 1);           /* #4 */
}

Let's say these two functions are running on different processors. Now what could possibly happen is that the store to a (#1) could be seen after the store to b (#2) by the second

Can non-atomic-load be reordered after atomic-acquire-load?

Submitted by 无人久伴 on 2019-11-29 21:09:19
Question: As is known, since C++11 there are 6 memory orders, and the documentation of std::memory_order_acquire says: http://en.cppreference.com/w/cpp/atomic/memory_order

    memory_order_acquire - A load operation with this memory order performs the acquire operation on the affected memory location: no memory accesses in the current thread can be reordered before this load. This ensures that all writes in other threads that release the same atomic variable are visible in the current thread.

1. Non