When implementing a class intended to be thread-safe, should I include a memory barrier at the end of its constructor, in order to ensure that any internal structures have completed being initialized before they can be accessed? Or is it the responsibility of the consumer to insert the memory barrier before making the instance available to other threads?
Simplified question:
Is there a race hazard in the code below that could give erroneous behaviour due to the lack of a memory barrier between the initialization and the access of the thread-safe class? Or should the thread-safe class itself protect against this?
    ConcurrentQueue<int> queue = null;
    Parallel.Invoke(
        () => queue = new ConcurrentQueue<int>(),
        () => queue?.Enqueue(5));
Note that it is acceptable for the program to enqueue nothing, as would happen if the second delegate executes before the first. (The null-conditional operator ?. protects against a NullReferenceException here.) However, it should not be acceptable for the program to throw an IndexOutOfRangeException or NullReferenceException, enqueue 5 multiple times, get stuck in an infinite loop, or do any of the other weird things caused by race hazards on internal structures.
Elaborated question:
Concretely, imagine that I were implementing a simple thread-safe wrapper for a queue. (I'm aware that .NET already provides ConcurrentQueue<T>; this is just an example.) I could write:
    public class ThreadSafeQueue<T>
    {
        private readonly Queue<T> _queue;

        public ThreadSafeQueue()
        {
            _queue = new Queue<T>();
            // Thread.MemoryBarrier(); // Is this line required?
        }

        public void Enqueue(T item)
        {
            lock (_queue)
            {
                _queue.Enqueue(item);
            }
        }

        public bool TryDequeue(out T item)
        {
            lock (_queue)
            {
                if (_queue.Count == 0)
                {
                    item = default(T);
                    return false;
                }
                item = _queue.Dequeue();
                return true;
            }
        }
    }
This implementation is thread-safe, once initialized. However, if the initialization itself is raced by another consumer thread, then race hazards could arise, whereby the latter thread would access the instance before the internal Queue<T> has been initialized. As a contrived example:
    ThreadSafeQueue<int> queue = null;
    Parallel.For(0, 10000, i =>
    {
        if (i == 0)
            queue = new ThreadSafeQueue<int>();
        else if (i % 2 == 0)
            queue?.Enqueue(i);
        else
        {
            int item = -1;
            if (queue?.TryDequeue(out item) == true)
                Console.WriteLine(item);
        }
    });
It is acceptable for the code above to miss some numbers; however, without the memory barrier, it could also throw a NullReferenceException (or produce some other weird result) because the internal Queue<T> has not been initialized by the time Enqueue or TryDequeue is called.
Is it the responsibility of the thread-safe class to include a memory barrier at the end of its constructor, or is it the consumer who should include a memory barrier between the class's instantiation and its visibility to other threads? What is the convention in the .NET Framework for classes marked as thread-safe?
Edit: This is an advanced threading topic, so I understand the confusion in some of the comments. An instance can appear as half-baked if accessed from other threads without proper synchronization. This topic is discussed extensively within the context of double-checked locking, which is broken under the ECMA CLI specification without the use of memory barriers (such as through volatile). Per Jon Skeet:
The Java memory model doesn't ensure that the constructor completes before the reference to the new object is assigned to instance. The Java memory model underwent a reworking for version 1.5, but double-check locking is still broken after this without a volatile variable (as in C#).
Without any memory barriers, it's broken in the ECMA CLI specification too. It's possible that under the .NET 2.0 memory model (which is stronger than the ECMA spec) it's safe, but I'd rather not rely on those stronger semantics, especially if there's any doubt as to the safety.
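For readers unfamiliar with the pattern, double-checked locking with a volatile field might be sketched as below. This is only an illustration of the pattern referred to above, not code from the quoted article; the names QueueHolder, _instance, and _gate are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class QueueHolder
{
    // volatile gives the read below acquire semantics and the write
    // release semantics, so a non-null reference implies that the
    // constructor's writes are visible to the reading thread.
    private static volatile Queue<int> _instance;
    private static readonly object _gate = new object();

    public static Queue<int> Instance
    {
        get
        {
            if (_instance == null)           // first check, without the lock
            {
                lock (_gate)
                {
                    if (_instance == null)   // second check, under the lock
                        _instance = new Queue<int>();
                }
            }
            return _instance;
        }
    }
}
```

Every caller of QueueHolder.Instance receives the same Queue<int> instance; the queue itself is still not thread-safe and would need its own locking.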
Lazy<T> is a very good choice for thread-safe initialization. I think it should be left to the consumer to provide that:
    var queue = new Lazy<ThreadSafeQueue<int>>(() => new ThreadSafeQueue<int>());
    Parallel.For(0, 10000, i =>
    {
        // Lazy<T> creates the queue on first access, so no i == 0 branch is needed.
        if (i % 2 == 0)
            queue.Value.Enqueue(i);
        else
        {
            int item = -1;
            if (queue.Value.TryDequeue(out item))
                Console.WriteLine(item);
        }
    });
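If the consumer wants to keep the racing assignment from the simplified question rather than switch to Lazy<T>, one option is to publish and read the reference through System.Threading.Volatile. The following is a minimal sketch under that assumption; the class PublishDemo and its Run method are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public static class PublishDemo
{
    private static ConcurrentQueue<int> _queue;

    public static int Run()
    {
        _queue = null;
        Parallel.Invoke(
            // Release write: the constructor's writes cannot be reordered
            // after the store of the reference.
            () => Volatile.Write(ref _queue, new ConcurrentQueue<int>()),
            // Acquire read: if the reference is seen as non-null, the fully
            // constructed queue is seen as well.
            () => Volatile.Read(ref _queue)?.Enqueue(5));

        // Parallel.Invoke returns only after both delegates have finished,
        // so the field is assigned by now. The count is 0 or 1 depending
        // on which delegate ran first.
        return _queue.Count;
    }
}
```

As in the original example, enqueuing nothing remains an acceptable outcome; the sketch only addresses visibility of the constructed instance.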
Unrelated, but it is still interesting that in Java, for all final fields that are written inside the constructor, two fences would be emitted after the constructor exits: StoreStore and LoadStore. That would make publishing the reference thread-safe.
In answer to your simplified question:
    ConcurrentQueue<int> queue = null;
    Parallel.Invoke(
        () => queue = new ConcurrentQueue<int>(),
        () => queue?.Enqueue(5));
It is definitely possible that your code could try to call queue.Enqueue(5) before queue has a value, but it isn't anything you could protect against from within the constructor of Queue. queue won't actually be assigned a reference to the new instance until the constructor completes.
No, you don't need a memory barrier in the constructor. Your assumption, even though it demonstrates some creative thought, is wrong. No thread can get a half-baked instance of queue. The new reference is "visible" to the other threads only when the initialization is done. Suppose thread_1 is the first thread to initialize queue: it goes through the ctor code, but queue's reference in the main stack is still null! Only when thread_1 exits the constructor code does it assign the reference.
See the comments below and the OP's elaborated question.
Source: https://stackoverflow.com/questions/38881722/should-thread-safe-class-have-a-memory-barrier-at-the-end-of-its-constructor