Question
I have this piece of code where I want to await an ongoing task if that task was created for the same input. Here is a minimal reproduction of what I'm doing.
private static ConcurrentDictionary<int, Task<int>> _tasks = new ConcurrentDictionary<int, Task<int>>();
private readonly ExternalService _service;
// Returns Task<int> (not Task) because the method returns a result.
public async Task<int> SampleTask() {
    var result = await _service.DoSomething();
    await Task.Delay(1000); // this task takes some time to finish
    return result;
}
public async Task<int> DoTask(int key) {
    // Reuse the in-flight task for this key if one exists.
    var task = _tasks.GetOrAdd(key, _ => SampleTask());
    var taskResult = await task;
    _tasks.TryRemove(key, out task);
    return taskResult;
}
I'm writing a test to ensure the same task is awaited when multiple requests want to perform the task at (roughly) the same time. I'm doing that by mocking _service and counting how many times _service.DoSomething() is called. It should be called only once if the calls to DoTask(int key) were made at roughly the same time.
However, the results show that if I call DoTask(int key) more than once with a delay between calls of less than 1~2ms, both calls will create and execute their own instance of SampleTask(), with the second one replacing the first one in the dictionary.
Considering this, can we say that this method is truly thread-safe? Or is my problem not a thread-safety issue per se?
Answer 1:
To quote the documentation (emphasis mine):
For modifications and write operations to the dictionary, ConcurrentDictionary<TKey,TValue> uses fine-grained locking to ensure thread safety. (Read operations on the dictionary are performed in a lock-free manner.) However, the valueFactory delegate is called outside the locks to avoid the problems that can arise from executing unknown code under a lock. Therefore, GetOrAdd is not atomic with regards to all other operations on the ConcurrentDictionary<TKey,TValue> class.
Since a key/value can be inserted by another thread while valueFactory is generating a value, you cannot trust that just because valueFactory executed, its produced value will be inserted into the dictionary and returned. If you call GetOrAdd simultaneously on different threads, valueFactory may be called multiple times, but only one key/value pair will be added to the dictionary.
So while the dictionary is properly thread-safe, calls to the valueFactory, or _ => SampleTask() in your case, are not guaranteed to be unique. So your factory function should be able to live with that fact.
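To make the quoted behavior concrete, here is a minimal sketch that forces the race on purpose. The class name FactoryRaceDemo and the use of a Barrier are assumptions made for illustration only; they are not part of the question's code. The barrier keeps the first factory call from returning until a second thread has also entered the factory, so the factory runs twice while only one value ends up in the dictionary.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical demo class, not part of the original question.
public static class FactoryRaceDemo
{
    // Returns (factory invocation count, value seen by thread 1, value seen by thread 2).
    public static (int calls, int a, int b) Run()
    {
        var dict = new ConcurrentDictionary<int, int>();
        int calls = 0;

        // Neither factory call can return until both threads are inside it,
        // so neither thread can add a value before the other has missed the lookup.
        using (var barrier = new Barrier(2))
        {
            Func<int, int> factory = key =>
            {
                int id = Interlocked.Increment(ref calls);
                barrier.SignalAndWait();
                return id; // each invocation produces a different value
            };

            int a = 0, b = 0;
            var t1 = new Thread(() => a = dict.GetOrAdd(1, factory));
            var t2 = new Thread(() => b = dict.GetOrAdd(1, factory));
            t1.Start();
            t2.Start();
            t1.Join();
            t2.Join();

            return (calls, a, b);
        }
    }
}
```

Under these assumptions, Run() reports two factory invocations, yet both threads receive the same value, because only one key/value pair is ever stored.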
You can confirm this from the source:
public TValue GetOrAdd(TKey key, Func<TKey, TValue> valueFactory)
{
    if (key == null) throw new ArgumentNullException("key");
    if (valueFactory == null) throw new ArgumentNullException("valueFactory");
    TValue resultingValue;
    if (TryGetValue(key, out resultingValue))
    {
        return resultingValue;
    }
    TryAddInternal(key, valueFactory(key), false, true, out resultingValue);
    return resultingValue;
}
As you can see, valueFactory is called outside of TryAddInternal, which is responsible for locking the dictionary properly.
However, valueFactory in your case is a lambda that returns a task (_ => SampleTask()), and the dictionary will not await that task itself. The function therefore finishes quickly: it returns the incomplete Task as soon as it encounters the first await that does not complete synchronously (when the async state machine is set up). So unless the calls come in very quick succession, the task should be added to the dictionary almost immediately, and subsequent calls will reuse the same task.
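The point that an async method hands back its Task long before the work completes can be checked with a small sketch. AsyncReturnDemo and SlowTask are hypothetical names, and Task.Delay(1000) stands in for the slow work in the question.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

// Hypothetical demo class, not part of the original question.
public static class AsyncReturnDemo
{
    // Stand-in for SampleTask(): one second of simulated work.
    public static async Task<int> SlowTask()
    {
        await Task.Delay(1000);
        return 42;
    }

    // Measures how long the call itself takes, without awaiting the result.
    public static (bool completedOnReturn, long elapsedMs) Measure()
    {
        var sw = Stopwatch.StartNew();
        Task<int> task = SlowTask(); // returns at the first incomplete await
        sw.Stop();
        return (task.IsCompleted, sw.ElapsedMilliseconds);
    }
}
```

The call returns an incomplete task well before the one-second delay has elapsed, which is why the window in which two callers can both miss the dictionary lookup is normally very short.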
If you require this to happen just once in all cases, you should consider locking on the task creation yourself. Since it will finish quickly (regardless of how long your task actually takes to resolve), locking will not hurt that much.
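One way the locking suggestion could look is sketched below, as an assumption rather than a definitive implementation. SingleFlight, _createLock, and the injected _work delegate (standing in for SampleTask()) are hypothetical names. The lock only covers task creation, which returns at the first await, so it is held briefly; the await itself happens outside the lock.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical wrapper, not part of the original question.
public class SingleFlight
{
    private readonly ConcurrentDictionary<int, Task<int>> _tasks =
        new ConcurrentDictionary<int, Task<int>>();
    private readonly object _createLock = new object();
    private readonly Func<Task<int>> _work; // stand-in for SampleTask()

    public SingleFlight(Func<Task<int>> work)
    {
        _work = work;
    }

    public async Task<int> DoTask(int key)
    {
        Task<int> task;
        // Fast path: reuse an in-flight task without taking the lock.
        if (!_tasks.TryGetValue(key, out task))
        {
            // Slow path: only one thread at a time may run the factory,
            // so the work is started at most once per in-flight key.
            lock (_createLock)
            {
                task = _tasks.GetOrAdd(key, _ => _work());
            }
        }
        var result = await task; // awaited outside the lock
        _tasks.TryRemove(key, out _);
        return result;
    }
}
```

A single lock object serializes task creation across all keys. That keeps the sketch short; it is a simplification, and whether it is acceptable depends on how expensive the synchronous part of the factory is.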
Source: https://stackoverflow.com/questions/53814980/is-concurrentdictionary-getoradd-truly-thread-safe