Question
The following piece of code is what you get after significantly simplifying the hazard-pointer algorithm (introduced in this paper). Because of the heavy simplification, it cannot be used in place of the algorithm (and one does not need to know anything about the algorithm to answer this question). However, I believe it still accurately represents the memory-ordering challenge in the original algorithm.
So the question is: what is the best memory ordering (the values of order1 ... order5) so that, if ptr->a = 1; gets executed, the result won't be undefined?
struct T { int a = 0; };
static_assert(std::is_trivially_destructible_v<T>);
std::atomic<T*> a{new T()};
std::atomic<T*> h{nullptr};
// Thread 1
auto ptr = a.load(order1);
h.store(ptr, order2);
if (ptr == nullptr || ptr != a.load(order3))
    return;
ptr->a = 1;
// Thread 2
auto ptr = a.exchange(nullptr, order4);
if (ptr != h.load(order5))
    delete ptr;
We know that for ptr->a = 1; to get executed, a.exchange must happen after the 2nd a.load (even relaxed memory ordering guarantees this). However, the problem is how to ensure that h.load will see the effect of h.store. I cannot figure out why the code would work, even if we use sequentially consistent memory ordering everywhere.
Answer 1:
For simplicity, such papers usually assume a sequentially consistent memory model, and that is also the case for the paper you referenced. Your example is highly simplified, but it still contains the gist of the hazard-pointer algorithm. You have to ensure that either Thread 2 "sees" the hazard pointer stored by Thread 1 (i.e., Thread 1 has acquired a safe reference), or Thread 1 sees the updated value of a.
In my argument I will use the following notation:
- a -sb-> b means "a is sequenced before b"
- a -sco-> b means "a precedes b in the single total order S of all sequentially consistent operations"
- a -rf-> b means "b reads the value written by a" (reads-from)
Let's assume that all atomic operations are sequentially consistent. That would give the following situation:
- Thread 1: a.load() -sb-> h.store() -sb-> a.load() -sb-> ptr->a=1
- Thread 2: a.exchange() -sb-> h.load() -> delete ptr
Since sequentially consistent operations are totally ordered, we have to consider two cases:
- h.store() -sco-> h.load()
  This implies h.store() -rf-> h.load(), i.e., Thread 2 is guaranteed to "see" the hazard pointer written by Thread 1, so it does not delete ptr (and Thread 1 can therefore safely update ptr->a).
- h.load() -sco-> h.store()
  Because we also have a.exchange() -sb-> h.load() (Thread 2) and h.store() -sb-> a.load() (Thread 1, the second load), this implies that a.exchange() -sco-> a.load() and therefore a.exchange() -rf-> a.load(), i.e., Thread 1 is guaranteed to "see" the updated value of a (and therefore does not attempt to update ptr->a).
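The case analysis above can be exercised with a small stress harness. This is only a sketch under assumptions that are not part of the original question: the function name count_unsafe_runs_seq_cst is made up, the object lives on the stack, and delete is replaced by setting a flag so that the harness itself never frees memory. A test like this cannot prove the ordering argument; it can only fail to find a counterexample.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

struct T { int a = 0; };

// Hypothetical harness: runs the two-thread protocol `iterations` times with
// memory_order_seq_cst everywhere and returns how often Thread 1 wrote
// through ptr while Thread 2 also decided to delete it.
// The delete is only recorded, never performed.
int count_unsafe_runs_seq_cst(int iterations) {
    assert(iterations > 0);
    int unsafe = 0;
    for (int i = 0; i < iterations; ++i) {
        T object;
        std::atomic<T*> a{&object};
        std::atomic<T*> h{nullptr};
        bool wrote = false;         // Thread 1 reached `ptr->a = 1`
        bool would_delete = false;  // Thread 2 reached `delete ptr`

        std::thread t1([&] {
            T* ptr = a.load(std::memory_order_seq_cst);
            h.store(ptr, std::memory_order_seq_cst);
            if (ptr == nullptr || ptr != a.load(std::memory_order_seq_cst))
                return;
            ptr->a = 1;
            wrote = true;
        });
        std::thread t2([&] {
            T* ptr = a.exchange(nullptr, std::memory_order_seq_cst);
            if (ptr != h.load(std::memory_order_seq_cst))
                would_delete = true;  // stands in for `delete ptr`
        });
        t1.join();
        t2.join();

        // Both flags set would mean a write to an object that was deleted.
        if (wrote && would_delete) ++unsafe;
    }
    return unsafe;
}
```

Calling count_unsafe_runs_seq_cst(2000) from main should return 0, since the two cases of the argument exclude a run in which both flags are set.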
So if all operations are sequentially consistent, the algorithm works as intended. But what if we cannot (or don't want to) assume that all operations are sequentially consistent? Can we relax some operations? The problem is that we have to ensure visibility between two different variables (a and h) in two different threads, and this requires stronger guarantees than acquire/release can provide. However, it is possible to relax the operations if you introduce sequentially consistent fences:
// Thread 1
auto ptr = a.load(std::memory_order_acquire);
h.store(ptr, std::memory_order_relaxed);
std::atomic_thread_fence(std::memory_order_seq_cst);
if (ptr == nullptr || ptr != a.load(std::memory_order_relaxed))
    return;
ptr->a = 1;
// Thread 2
auto ptr = a.exchange(nullptr, std::memory_order_relaxed);
std::atomic_thread_fence(std::memory_order_seq_cst);
if (ptr != h.load(std::memory_order_relaxed))
    delete ptr;
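Each thread here writes one variable and then reads the other, which is the shape commonly called a store-buffering litmus test. The sketch below isolates that shape with two plain integers (x plays the role of h, y the role of a); the function name count_both_zero and its parameters are made up for illustration. With release stores and acquire loads, the outcome where both threads read 0 is permitted by the memory model, although a given machine or run may never show it. With relaxed operations separated by a sequentially consistent fence, that outcome is ruled out.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical litmus harness: each thread stores 1 to its own variable and
// then loads the other one. Returns how often both loads returned 0.
int count_both_zero(int iterations, bool use_seq_cst_fences) {
    assert(iterations > 0);
    int both_zero = 0;
    for (int i = 0; i < iterations; ++i) {
        std::atomic<int> x{0}, y{0};
        int r1 = -1, r2 = -1;

        auto side = [use_seq_cst_fences](std::atomic<int>& mine,
                                         std::atomic<int>& other, int& out) {
            if (use_seq_cst_fences) {
                mine.store(1, std::memory_order_relaxed);
                std::atomic_thread_fence(std::memory_order_seq_cst);
                out = other.load(std::memory_order_relaxed);
            } else {
                // Release/acquire alone does not order a store before a
                // later load of a different variable.
                mine.store(1, std::memory_order_release);
                out = other.load(std::memory_order_acquire);
            }
        };

        std::thread t1([&] { side(x, y, r1); });
        std::thread t2([&] { side(y, x, r2); });
        t1.join();
        t2.join();

        if (r1 == 0 && r2 == 0) ++both_zero;
    }
    return both_zero;
}
```

With use_seq_cst_fences set to true the result should always be 0. With false the count may be nonzero, but because a new thread pair is started per iteration the window is small, so seeing 0 there is not evidence that release/acquire is sufficient.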
So we have the following situation:
- Thread 1: a.load() -sb-> h.store() -sb-> fence() -sb-> a.load() -sb-> ptr->a=1
- Thread 2: a.exchange() -sb-> fence() -sb-> h.load() -> delete ptr
The standard states:
For atomic operations A and B on an atomic object M, where A modifies M and B takes its value, if there are memory_order_seq_cst fences X and Y such that A is sequenced before X, Y is sequenced before B, and X precedes Y in S, then B observes either the effects of A or a later modification of M in its modification order.
The fences are also part of the single total order S, so we again have two cases to consider:
- Thread 1 fence -sco-> Thread 2 fence
  Since h.store() -sb-> fence() (Thread 1) and fence() -sb-> h.load() (Thread 2), it is guaranteed that Thread 2 "sees" the hazard pointer written by Thread 1.
- Thread 2 fence -sco-> Thread 1 fence
  Since a.exchange() -sb-> fence() (Thread 2) and fence() -sb-> a.load() (Thread 1), it is guaranteed that Thread 1 "sees" the updated value of a.
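To make the two sides of the fence-based protocol easier to reuse and test, they can be wrapped in two small functions. This is a sketch only: the names try_protect, unlink_if_unprotected and count_unsafe_runs_fenced are hypothetical and do not come from the original post or from any library, and the harness records the decision to delete instead of actually deleting.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

struct T { int a = 0; };

// Hypothetical helper, Thread 1's side: publish the current value of `src`
// in `hazard`; return it only if `src` still holds it after the fence.
T* try_protect(std::atomic<T*>& src, std::atomic<T*>& hazard) {
    T* ptr = src.load(std::memory_order_acquire);
    hazard.store(ptr, std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_seq_cst);
    if (ptr == nullptr || ptr != src.load(std::memory_order_relaxed))
        return nullptr;
    return ptr;
}

// Hypothetical helper, Thread 2's side: unlink the object and return it if
// no hazard pointer protecting it is visible (the caller may then reclaim
// it); return nullptr otherwise.
T* unlink_if_unprotected(std::atomic<T*>& src, std::atomic<T*>& hazard) {
    T* ptr = src.exchange(nullptr, std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_seq_cst);
    return ptr != hazard.load(std::memory_order_relaxed) ? ptr : nullptr;
}

// Hypothetical harness: counts runs in which Thread 1 wrote through the
// pointer although Thread 2 considered the object reclaimable.
int count_unsafe_runs_fenced(int iterations) {
    assert(iterations > 0);
    int unsafe = 0;
    for (int i = 0; i < iterations; ++i) {
        T object;
        std::atomic<T*> a{&object};
        std::atomic<T*> h{nullptr};
        bool wrote = false;
        bool reclaimable = false;

        std::thread t1([&] {
            if (T* p = try_protect(a, h)) {
                p->a = 1;
                wrote = true;
            }
        });
        std::thread t2([&] {
            reclaimable = unlink_if_unprotected(a, h) != nullptr;
        });
        t1.join();
        t2.join();

        if (wrote && reclaimable) ++unsafe;
    }
    return unsafe;
}
```

Following the two cases above, count_unsafe_runs_fenced(2000) should return 0: either Thread 2 observes the hazard pointer, or Thread 1 observes that a has changed.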
The latter version is exactly how I have implemented hazard pointers in my xenium library.
Source: https://stackoverflow.com/questions/62240646/memory-ordering-for-hazard-pointers