memory-model

C++ std::atomic: what is std::memory_order and how to use it?

Submitted by 淺唱寂寞╮ on 2019-12-17 17:24:52
Question: Can anyone explain what std::memory_order is in plain English, and how to use it with std::atomic<>? I found the reference and a few examples here, but don't understand them at all: http://en.cppreference.com/w/cpp/atomic/memory_order Answer 1: "Can anyone explain what std::memory_order is in plain English" — the best plain-English explanation I've found for the various memory orderings is Bartosz Milewski's article on relaxed atomics: http://bartoszmilewski.com/2008/12/01/c-atomics-and-memory-ordering/
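As a minimal sketch of the most common std::memory_order usage (this example is illustrative, not taken from the answer above): a release store "publishes" plain data to an acquire load in another thread.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

std::atomic<bool> ready{false};
int data = 0;

void producer() {
    data = 42;                                    // plain, non-atomic write
    ready.store(true, std::memory_order_release); // publishes `data`: nothing
                                                  // above may move below this
}

int consumer() {
    while (!ready.load(std::memory_order_acquire)) {} // spin until published
    return data; // guaranteed to read 42: the acquire load saw the release store
}
```

The pair release/acquire is the workhorse; memory_order_seq_cst is the (stronger, default) ordering, and memory_order_relaxed drops the ordering guarantee entirely, keeping only atomicity.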

What does each memory_order mean?

Submitted by 血红的双手。 on 2019-12-17 10:11:43
Question: I read a chapter and didn't like it much. I'm still unclear on what the difference is between each memory order. This is my current speculation, which I formed after reading the much simpler http://en.cppreference.com/w/cpp/atomic/memory_order (the below is wrong, so don't try to learn from it): memory_order_relaxed: does not sync, but is not ignored when ordering is done from another mode in a different atomic var. memory_order_consume: syncs reading this atomic variable; however, it doesn't sync …
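One ordering that is easy to pin down concretely is memory_order_relaxed: it keeps the operation atomic but imposes no inter-thread ordering, which is exactly what a plain event counter needs. A sketch (names are illustrative):

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

std::atomic<int> hits{0};

// Count events from several threads. Relaxed ordering is sufficient because
// no other data is published through `hits`; only the total matters.
int count_relaxed(int n_threads, int per_thread) {
    hits.store(0, std::memory_order_relaxed);
    std::vector<std::thread> pool;
    for (int t = 0; t < n_threads; ++t)
        pool.emplace_back([per_thread] {
            for (int i = 0; i < per_thread; ++i)
                hits.fetch_add(1, std::memory_order_relaxed); // atomic, unordered
        });
    for (auto& th : pool) th.join();
    return hits.load(std::memory_order_relaxed); // exact: n_threads * per_thread
}
```

No increment is ever lost (atomicity), but nothing about the surrounding reads and writes is ordered by these operations.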

Can modern x86 hardware not store a single byte to memory?

Submitted by 痞子三分冷 on 2019-12-17 01:13:13
Question: Speaking of the C++ memory model for concurrency, Stroustrup's The C++ Programming Language, 4th ed., sect. 41.2.1, says: "... (like most modern hardware) the machine could not load or store anything smaller than a word." However, my x86 processor, a few years old, can and does store objects smaller than a word. For example:

    #include <iostream>
    int main() {
        char a = 5;
        char b = 25;
        a = b;
        std::cout << int(a) << "\n";
        return 0;
    }

Without optimization, GCC compiles this as: [...] movb $5, -1(%rbp) …
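The practical guarantee behind this question can be demonstrated directly: C++11 defines adjacent char objects as distinct memory locations, so two threads may write them concurrently without a data race, and on x86 each assignment compiles to a single-byte store (movb). A sketch (the struct and function names are illustrative):

```cpp
#include <cassert>
#include <thread>

struct Pair { char a; char b; };

// Two threads write adjacent bytes. If the hardware could only store whole
// words (read-modify-write of the containing word), one store could clobber
// the other; byte stores make this race-free by the C++11 memory model.
Pair write_adjacent_bytes() {
    Pair p{0, 0};
    std::thread t1([&p] { p.a = 5;  }); // byte store to p.a only
    std::thread t2([&p] { p.b = 25; }); // byte store to p.b, must not touch p.a
    t1.join();
    t2.join();
    return p;
}
```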

Why is a memory order constraint needed on a reference counter?

Submitted by 夙愿已清 on 2019-12-14 03:55:35
Question: In the boost::atomic examples, the unref function is:

    void intrusive_ptr_release(const X * x) {
        if (x->refcount_.fetch_sub(1, boost::memory_order_release) == 1) {
            boost::atomic_thread_fence(boost::memory_order_acquire);
            delete x;
        }
    }

1: The fetch_sub op is constrained by memory_order_release, which prevents preceding operations from being reordered past that point. But in what scenarios would such reordering actually matter? 2: In addition to the memory_order_release on the atomic op, why is there an …
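The same pattern can be sketched with std::atomic (the struct X and the g_deleted test hook are illustrative, not from the question). The release on fetch_sub ensures each thread's last uses of the object are visible before its reference is dropped; the acquire fence in the thread that sees the count reach zero synchronizes with all those releases, so the destructor observes every prior access.

```cpp
#include <atomic>
#include <cassert>

int g_deleted = 0; // illustrative hook: counts destructions

struct X {
    mutable std::atomic<int> refcount_{1};
    ~X() { ++g_deleted; }
};

void intrusive_ptr_add_ref(const X* x) {
    // Taking a new reference needs only atomicity, no ordering.
    x->refcount_.fetch_add(1, std::memory_order_relaxed);
}

void intrusive_ptr_release(const X* x) {
    // Release: makes this thread's earlier accesses to *x visible
    // before the count drops.
    if (x->refcount_.fetch_sub(1, std::memory_order_release) == 1) {
        // Acquire: synchronizes with every other thread's release above,
        // so `delete` sees all of their accesses as completed.
        std::atomic_thread_fence(std::memory_order_acquire);
        delete x;
    }
}
```

Without the release, another thread's writes through the object could still be in flight when the last owner deletes it; without the acquire, the deleter might not observe them.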

x86: Are memory barriers needed here?

Submitted by 别等时光非礼了梦想. on 2019-12-13 17:37:42
Question: In WB (write-back) memory, with a = b = 0:

    P1:                  P2:
    a = 1                WHILE (b == 0) {}
    SFENCE               LFENCE
    b = 1                ASSERT (a == 0)

It is my understanding that neither the SFENCE nor the LFENCE is needed here, since for this memory type x86 ensures: reads can't be reordered with older reads; stores can't be reordered with older stores; stores are transitively visible. Answer 1: The lfence and sfence asm instructions are no-ops unless you're using NT stores (or NT loads from WC memory, e.g. video RAM). (Actually, movntdqa loads might …
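The portable way to write this litmus test is with C++11 atomics; release/acquire compile to plain mov on x86 (no fence instructions), matching the answer's point that the fences are redundant here. A sketch (function names are illustrative):

```cpp
#include <atomic>
#include <cassert>
#include <thread>

std::atomic<int> a{0}, b{0};

void p1() {
    a.store(1, std::memory_order_release); // on x86: plain store; stores stay ordered
    b.store(1, std::memory_order_release);
}

int p2() {
    while (b.load(std::memory_order_acquire) == 0) {} // spin until b == 1
    return a.load(std::memory_order_acquire);         // must observe a == 1
}
```

Because the acquire load of b synchronizes with the release store of b, and a = 1 is sequenced before that store, p2 can never return 0 — the original ASSERT (a == 0) must fail.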

Concurrency and memory models

Submitted by 陌路散爱 on 2019-12-13 11:41:17
Question: I'm watching this video by Herb Sutter on GPGPU and the new C++ AMP library. He talks about memory models, mentioning weak memory models and strong memory models, and I think he's referring to read/write ordering etc., but I'm not sure. Google turns up some interesting results (mostly science papers) on memory models, but can someone explain what a weak memory model and a strong memory model are, and their relation to concurrency? Answer 1: In terms of concurrency, a memory …

What guarantees that different unrelated objects in two unrelated threads don't have an (unavoidable) race condition?

Submitted by 风格不统一 on 2019-12-13 10:30:32
Question: When different threads use only unrelated objects and literally share nothing, they cannot have a race condition, right? Not obviously: all threads actually do share something, the address space. There is no guarantee that a memory location used by one thread won't be allocated at some other time to another thread. This can be true of memory for dynamically allocated objects or even for automatic objects: there is no prescription that the memory space for the "stacks" (the …

Manual synchronization in OpenMP while loop

Submitted by 淺唱寂寞╮ on 2019-12-12 15:08:59
Question: I recently started working with OpenMP to do some 'research' for a project at university. I have a rectangular, evenly spaced grid on which I'm solving a partial differential equation with an iterative scheme. So I basically have two for-loops (one each in the x- and y-direction of the grid) wrapped in a while-loop for the iterations. Now I want to investigate different parallelization schemes for this. The first (obvious) approach was a spatial parallelization of the for-loops. Works …
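The spatial-parallelization baseline the question describes can be sketched as a single Jacobi sweep with the two grid loops collapsed into one OpenMP parallel region (the function name and 5-point stencil are illustrative, assuming a stencil of that shape; without -fopenmp the pragma is ignored and the code runs serially, unchanged):

```cpp
#include <vector>

// One Jacobi iteration over an n x n grid stored row-major in `u`.
// Each interior point becomes the average of its four neighbours; the
// boundary rows/columns are carried over unchanged from `u`.
std::vector<double> jacobi_sweep(const std::vector<double>& u, int n) {
    std::vector<double> next(u);
    #pragma omp parallel for collapse(2)
    for (int i = 1; i < n - 1; ++i)
        for (int j = 1; j < n - 1; ++j)
            next[i * n + j] = 0.25 * (u[(i - 1) * n + j] + u[(i + 1) * n + j]
                                    + u[i * n + j - 1] + u[i * n + j + 1]);
    return next;
}
```

Because the sweep reads only the old array and writes only the new one, the iterations of the collapsed loop are independent, which is why this parallelization "just works"; the interesting synchronization questions arise once the surrounding while-loop's convergence check is brought inside the parallel region.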

Cache coherence literature generally refers only to store buffers, not read buffers. Yet doesn't one somehow need both?

Submitted by 六眼飞鱼酱① on 2019-12-12 08:58:56
Question: When reading about consistency models (namely x86-TSO), authors generally resort to models with a set of CPUs, their associated store buffers, and their private caches. If my understanding is correct, a store buffer can be described as a queue into which a CPU may put any store instruction it wants to commit to memory. So, as the name states, they are store buffers. But when I read those papers, they tend to talk about the interaction of loads and stores, with statements such as "a …
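The store buffer's observable effect is captured by the classic store-buffering (SB) litmus test: each thread's store can sit in its local buffer while its subsequent load reads memory, so both threads may read the old value 0. In C++, seq_cst forbids that outcome; a sketch (the function name is illustrative):

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <utility>

std::atomic<int> x{0}, y{0};

// SB litmus test: T1 stores x then loads y; T2 stores y then loads x.
// With relaxed atomics (or raw hardware TSO), (r1, r2) == (0, 0) is
// possible via store buffering; with seq_cst it is forbidden.
std::pair<int, int> run_sb_once() {
    x.store(0); y.store(0);
    int r1 = 0, r2 = 0;
    std::thread t1([&] { x.store(1, std::memory_order_seq_cst);
                         r1 = y.load(std::memory_order_seq_cst); });
    std::thread t2([&] { y.store(1, std::memory_order_seq_cst);
                         r2 = x.load(std::memory_order_seq_cst); });
    t1.join();
    t2.join();
    return {r1, r2}; // at least one of r1, r2 is 1 under seq_cst
}
```

This is also why loads don't need a symmetric "read buffer" in the model: a load either hits the CPU's own store buffer (forwarding) or goes to the coherent memory system, whereas a buffered store is genuinely invisible to other CPUs until it drains.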

What does “happens before” mean in the C++11 spec?

Submitted by 柔情痞子 on 2019-12-11 10:09:06
Question: I'm trying to understand the meaning of "happens before" in the C++11 spec, and in particular whether the spec assumes any informal understanding of the term beyond what is specified. I'm working from draft N3290. A straightforward argument that the term should be interpreted only with respect to the specification itself is that the spec actually talks about its own definition of the term; for example, 1.10.6: "happens before (as defined below)", or 1.10.11: "'happens before' relation, …
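The spec's formal relations can be seen composing in a four-line example (illustrative, not from the question): a write sequenced before a release store, which synchronizes with an acquire load, which is sequenced before a read. That composition is exactly "happens before", so the final read must see the first write.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

int msg = 0;
std::atomic<bool> flag{false};

void writer() {
    msg = 7;                                     // A: sequenced-before B
    flag.store(true, std::memory_order_release); // B
}

int reader() {
    while (!flag.load(std::memory_order_acquire)) {} // C: synchronizes-with B
    return msg;                                      // D: A happens-before D,
}                                                    //    so D reads 7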