memory-barriers

Why does ThreadSanitizer report a race with this lock-free example?

Posted by 試著忘記壹切 on 2019-12-18 11:50:19
Question: I've boiled this down to a simple self-contained example. The main thread enqueues 1000 items, and a worker thread tries to dequeue concurrently. ThreadSanitizer complains that there's a race between the read and the write of one of the elements, even though an acquire-release sequence protects them. #include <atomic> #include <thread> #include <cassert> struct FakeQueue { int items[1000]; std::atomic<int> m_enqueueIndex; int m_dequeueIndex; FakeQueue() : m…

Thread synchronization: how exactly does a lock make memory access "correct"?

Posted by 懵懂的女人 on 2019-12-18 11:49:52
Question: First of all, I know that lock{} is syntactic sugar for the Monitor class. I was playing with simple multithreading problems and discovered that I cannot fully understand how locking some arbitrary word of memory keeps the whole of the rest of memory from being cached in registers/CPU cache, etc. It's easier to explain with a code sample: for (int i = 0; i < 100 * 1000 * 1000; ++i) { ms_Sum += 1; } In the end ms_Sum will contain 100000000 which is, of course,…

Is a memory barrier required if a second thread waits for termination of the first one?

Posted by 不打扰是莪最后的温柔 on 2019-12-18 09:45:52
Question: Suppose that thread Alpha is writing to variable A without locking. A second thread Beta waits for Alpha to terminate, then reads variable A in turn. Is it possible that the contents of A will not be fresh? Can memory writes be delayed beyond the thread's lifetime? Won't the standard mechanism of waiting for thread Alpha's termination implicitly act as a memory barrier? UPDATE 1 Are there any examples of waiting that do not include a memory barrier? Answer 1: Almost certainly (the API…

Determining the location for the usage of barriers (fences)

Posted by 守給你的承諾、 on 2019-12-18 07:14:42
Question: The x86 instructions lfence/sfence/mfence are used to implement the rmb()/wmb()/mb() mechanisms in the Linux kernel. It is easy to understand that these serialize memory accesses. However, it is much more difficult to determine when and where to use them while writing the code -- before a bug shows up in the runtime behavior. I was interested to know whether there are known caveats that can be checked, while writing/reviewing code, to help us determine where the…

Do memory barriers guarantee a fresh read in C#?

Posted by 淺唱寂寞╮ on 2019-12-18 06:56:42
Question: If we have the following code in C#: int a = 0; int b = 0; void A() // runs in thread A { a = 1; Thread.MemoryBarrier(); Console.WriteLine(b); } void B() // runs in thread B { b = 1; Thread.MemoryBarrier(); Console.WriteLine(a); } The MemoryBarriers make sure that each write takes place before the following read. However, is it guaranteed that the write of one thread is seen by the read on the other thread? In other words, is it guaranteed that at least one thread prints 1, or that both thread…

Using time stamp counter and clock_gettime for cache miss

Posted by [亡魂溺海] on 2019-12-17 22:28:42
Question: As a follow-up to this topic, in order to measure the memory miss latency, I have written the following code using _mm_clflush, __rdtsc and _mm_lfence (based on the code from this question/answer). As you can see in the code, I first load the array into the cache. Then I flush one element so that the cache line is evicted from all cache levels. I put _mm_lfence in to preserve ordering under -O3. Next, I used the time stamp counter to measure the latency of reading the array…

When should I use _mm_sfence _mm_lfence and _mm_mfence

Posted by 家住魔仙堡 on 2019-12-17 06:10:13
Question: I read the "Intel Optimization Guide for Intel Architecture". However, I still have no idea when I should use _mm_sfence(), _mm_lfence(), or _mm_mfence(). Could anyone explain when these should be used when writing multi-threaded code? Answer 1: Caveat: I'm no expert in this. I'm still trying to learn this myself. But since no one has replied in the past two days, it seems experts on memory fence instructions are not plentiful. So here's my understanding... Intel is a weakly-ordered memory…

Are memory fences required here?

Posted by 眉间皱痕 on 2019-12-13 16:39:38
Question: Consider this code (extracted from Simple-Web-Server, but knowledge of the library shouldn't be necessary to answer this question): HttpServer server; thread server_thread; server.config.port = 8080; server.default_resource["GET"] = [](shared_ptr<HttpServer::Response> response, shared_ptr<HttpServer::Request> request) { string content = "Hello world!"; *response << "HTTP/1.1 200 OK\r\nContent-Length: " << content.size() << "\r\n\r\n" << content; }; server_thread = thread([&server]() { server…

How can I observe "LFENCE or SFENCE can not pass earlier read/write"?

Posted by 两盒软妹~` on 2019-12-13 16:27:08
Question: I'm doing some work on functional safety. I need to verify some x86 CPU instructions, such as LFENCE, SFENCE and MFENCE. So far I can demonstrate MFENCE per Intel SDM chapter 8.2.3.4, "Loads may be reordered with earlier stores to different locations": asm volatile("xor %0, %0\n\t" "movl $1, %1\n\t" "mfence\n\t" "movl %2, %0\n\t" : "=r"(r1), "=m"(X) : "m"(Y) : "memory"); asm volatile("xor %0, %0\n\t" "movl $1, %1\n\t" "mfence\n\t" "movl %2, %0\n\t" : "=r"(r2), "=m"(Y) : "m"(X) : "memory"); The above code only…

Why does this acquire and release memory fence not give a consistent value?

Posted by 老子叫甜甜 on 2019-12-12 22:46:35
Question: I'm just exploring the use of acquire and release memory fences and don't understand why I sometimes get an output of zero instead of the value 2 every time. I ran the program a number of times, and assumed the atomic store before the release barrier and the atomic load after the acquire barrier would ensure the values always synchronise. #include <iostream> #include <thread> #include <atomic> std::atomic<int> x; void write() { x.store(2, std::memory_order_relaxed); std::atomic…