The C++11 standard defines a memory model (1.7, 1.10) which contains memory orderings, which are, roughly, \"sequentially-consistent\", \"acquire\", \"consume\", \"rele
Load-consume is much like load-acquire, except that it induces happens-before relationships only to expression evaluations that are data-dependent on the load-consume. Wrapping an expression with kill_dependency
results in a value that no longer carries a dependency from the load-consume.
The key use case is for the writer to construct a data structure sequentially, then swing a shared pointer to the new structure (using a release
or acq_rel
atomic). The reader uses load-consume to read the pointer, and dereferences it to get to the data structure. The dereference creates a data dependency, so the reader is guaranteed to see the initialized data.
std::atomic foo {nullptr};
std::atomic bar;
void thread1()
{
bar = 7;
int * x = new int {51};
foo.store(x, std::memory_order_release);
}
void thread2()
{
int *y = foo.load(std::memory_order_consume)
if (y)
{
assert(*y == 51); //succeeds
// assert(bar == 7); //undefined behavior - could race with the store to bar
// assert(kill_dependency(*y) + bar == 58) // undefined behavior (same reason)
assert(*y + bar == 58); // succeeds - evaluation of bar pulled into the dependency
}
}
There are two reasons for providing load-consume. The primary reason is that ARM and Power loads are guaranteed to consume, but require additional fencing to turn them into acquires. (On x86, all loads are acquires, so consume provides no direct performance advantage under naive compilation.) The secondary reason is that the compiler can move later operations without data dependence up to before the consume, which it can't do for an acquire. (Enabling such optimizations is the big reason for building all of this memory ordering into the language.)
Wrapping a value with kill_dependency
allows computation of an expression that depends on the value to be moved to before the load-consume. This is useful e.g. when the value is an index into an array that was previously read.
Note that the use of consume results in a happens-before relation that is no longer transitive (though it is still guaranteed to be acyclic). For example, the store to bar
happens before the store to foo, which happens before the dereference of y
, which happens before the read of bar
(in the commented-out assert), but the store to bar
doesn't happen before the read of bar
. This leads to a rather more complicated definition of happens-before, but you can imagine how it works (start with sequenced-before, then propagate through any number of release-consume-dataDependency or release-acquire-sequencedBefore links)