memory-barriers

Why does ThreadSanitizer report a race with this lock-free example?

Posted by 試著忘記壹切 on 2019-12-18 11:50:19
Question: I've boiled this down to a simple self-contained example. The main thread enqueues 1000 items, and a worker thread tries to dequeue concurrently. ThreadSanitizer complains that there's a race between the read and the write of one of the elements, even though an acquire-release sequence protects them. #include <atomic> #include <thread> #include <cassert> struct FakeQueue { int items[1000]; std::atomic<int> m_enqueueIndex; int m_dequeueIndex; FakeQueue() : m…

Thread synchronization: how exactly does a lock make memory access "correct"?

Posted by 懵懂的女人 on 2019-12-18 11:49:52
Question: First of all, I know that lock{} is syntactic sugar for the Monitor class. I was playing with simple multithreading problems and discovered that I cannot fully understand how locking some arbitrary word of memory keeps the whole of the rest of memory from being cached in registers/CPU cache, etc. It's easier to explain with a code sample: for (int i = 0; i < 100 * 1000 * 1000; ++i) { ms_Sum += 1; } In the end ms_Sum will contain 100000000 which is, of course,…

Is a memory barrier required if a second thread waits for termination of the first one?

Posted by 不打扰是莪最后的温柔 on 2019-12-18 09:45:52
Question: Suppose that thread Alpha is writing to variable A without locking. A second thread Beta waits for Alpha to terminate, then reads variable A in turn. Is it possible that the contents of A will not be fresh? Can memory writes be delayed beyond the thread's lifetime? Won't the standard mechanism of waiting for thread Alpha's termination implicitly act as a memory barrier? UPDATE 1 Are there any examples of waiting that do not include a memory barrier? Answer 1: Almost certainly (the API…

Determining the location for the usage of barriers (fences)

Posted by 守給你的承諾、 on 2019-12-18 07:14:42
Question: The x86 instructions lfence/sfence/mfence are used to implement the rmb()/wmb()/mb() mechanisms in the Linux kernel. It is easy to understand that these serialize memory accesses. However, it is much more difficult to determine when and where to use them while writing the code -- before a bug shows up in the runtime behavior. I was interested to know whether there are known caveats that can be checked, while writing/reviewing code, to help us determine where the…

Do memory barriers guarantee a fresh read in C#?

Posted by 淺唱寂寞╮ on 2019-12-18 06:56:42
Question: If we have the following code in C#: int a = 0; int b = 0; void A() // runs in thread A { a = 1; Thread.MemoryBarrier(); Console.WriteLine(b); } void B() // runs in thread B { b = 1; Thread.MemoryBarrier(); Console.WriteLine(a); } The MemoryBarriers make sure that each write takes place before the following read. However, is it guaranteed that the write of one thread is seen by the read on the other thread? In other words, is it guaranteed that at least one thread prints 1, or that both thread…

Using time stamp counter and clock_gettime for cache miss

Posted by [亡魂溺海] on 2019-12-17 22:28:42
Question: As a follow-up to this topic, in order to measure the memory miss latency, I have written the following code using _mm_clflush, __rdtsc and _mm_lfence (based on the code from this question/answer). As you can see in the code, I first load the array into the cache. Then I flush one element so that the cache line is evicted from all cache levels. I put _mm_lfence in to preserve ordering under -O3. Next, I used the time stamp counter to measure the latency of reading the array…

When should I use _mm_sfence _mm_lfence and _mm_mfence

Posted by 家住魔仙堡 on 2019-12-17 06:10:13
Question: I read the "Intel Optimization Guide for Intel Architecture". However, I still have no idea when I should use _mm_sfence(), _mm_lfence(), or _mm_mfence(). Could anyone explain when these should be used when writing multi-threaded code? Answer 1: Caveat: I'm no expert in this. I'm still trying to learn this myself. But since no one has replied in the past two days, it seems experts on memory fence instructions are not plentiful. So here's my understanding... Intel is a weakly-ordered memory…

Are memory fences required here?

Posted by 眉间皱痕 on 2019-12-13 16:39:38
Question: Consider this code (extracted from Simple-Web-Server, but knowledge of the library shouldn't be necessary to answer this question): HttpServer server; thread server_thread; server.config.port = 8080; server.default_resource["GET"] = [](shared_ptr<HttpServer::Response> response, shared_ptr<HttpServer::Request> request) { string content = "Hello world!"; *response << "HTTP/1.1 200 OK\r\nContent-Length: " << content.size() << "\r\n\r\n" << content; }; server_thread = thread([&server]() { server…

How can I observe "LFENCE or SFENCE can not pass earlier read/write"?

Posted by 两盒软妹~` on 2019-12-13 16:27:08
Question: I'm doing some work on functional safety. I need to verify some x86 CPU instructions, such as LFENCE, SFENCE and MFENCE. So far I can demonstrate MFENCE per Intel SDM chapter 8.2.3.4, "Loads may be reordered with earlier stores to different locations": asm volatile("xor %0, %0\n\t" "movl $1, %1\n\t" "mfence\n\t" "movl %2, %0\n\t" : "=r"(r1), "=m"(X) : "m"(Y) : "memory"); asm volatile("xor %0, %0\n\t" "movl $1, %1\n\t" "mfence\n\t" "movl %2, %0\n\t" : "=r"(r2), "=m"(Y) : "m"(X) : "memory"); The above code only…

Why does this acquire and release memory fence not give a consistent value?

Posted by 老子叫甜甜 on 2019-12-12 22:46:35
Question: I'm just exploring the use of acquire and release memory fences and don't understand why I sometimes get an output of zero instead of the value 2 every time. I ran the program a number of times, and assumed the atomic store before the release barrier and the atomic load after the acquire barrier would ensure the values always synchronise. #include <iostream> #include <thread> #include <atomic> std::atomic<int> x; void write() { x.store(2, std::memory_order_relaxed); std::atomic…