Background:
I maintain several WinForms apps and class libraries that either could or already do benefit from caching. I'm also aware of the Cachi
It looks like the .NET 4.0 concurrent collections utilize new synchronization primitives that spin before switching context, in case a resource is freed quickly. So they're still locking, just in a more opportunistic way. If you think your data retrieval logic is shorter than the timeslice, then it seems like this would be highly beneficial. But you mentioned network, which makes me think this doesn't apply.
I would wait till you have a simple, synchronized solution in place, and measure the performance and behavior before assuming you will have performance issues related to concurrency.
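As a baseline to measure against, a minimal sketch of such a simple, synchronized cache might look like the following. The class name `SimpleCache` and the `load` delegate are hypothetical stand-ins for your own types and data retrieval logic.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical baseline: a single lock guards both the lookup and the load.
public class SimpleCache<TKey, TValue>
{
    private readonly Dictionary<TKey, TValue> _items = new Dictionary<TKey, TValue>();
    private readonly object _sync = new object();
    private readonly Func<TKey, TValue> _load; // stand-in for the real data retrieval

    public SimpleCache(Func<TKey, TValue> load)
    {
        _load = load;
    }

    public TValue Get(TKey key)
    {
        lock (_sync)
        {
            TValue value;
            if (!_items.TryGetValue(key, out value))
            {
                // The load runs while the lock is held, so every other caller
                // waits; that is the cost to measure before optimizing.
                value = _load(key);
                _items.Add(key, value);
            }
            return value;
        }
    }
}
```

If profiling shows that callers for unrelated keys spend significant time blocked on `_sync`, that is the signal to move on to something finer-grained.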
If you're really concerned about cache contention, you can utilize an existing cache infrastructure and logically partition it into regions. Then synchronize access to each region independently.
As an example strategy: if your data set consists of items keyed on numeric IDs and you want to partition your cache into 10 regions, you can take the ID mod 10 to determine which region an item belongs to. You'd keep an array of 10 objects to lock on. All of the code can be written for a variable number of regions, which can be set via configuration, or determined at app start depending on the total number of items you predict/intend to cache.
If your cache hits are keyed in an abnormal way, you'll have to come up with some custom heuristic to partition the cache.
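The mod-N partitioning described above could be sketched roughly as follows. `PartitionedCache`, `RegionOf`, and the `load` delegate are hypothetical names, and the region count is passed in so it can come from configuration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of a cache split into independently locked regions.
public class PartitionedCache<TValue>
{
    private readonly object[] _regionLocks;
    private readonly Dictionary<int, TValue>[] _regions;
    private readonly Func<int, TValue> _load; // stand-in for the real data retrieval

    public PartitionedCache(int regionCount, Func<int, TValue> load)
    {
        if (regionCount < 1) throw new ArgumentOutOfRangeException("regionCount");
        _load = load;
        _regionLocks = new object[regionCount];
        _regions = new Dictionary<int, TValue>[regionCount];
        for (int i = 0; i < regionCount; i++)
        {
            _regionLocks[i] = new object();
            _regions[i] = new Dictionary<int, TValue>();
        }
    }

    // Maps an ID to its region; the double mod keeps negative IDs in range.
    public int RegionOf(int id)
    {
        int n = _regions.Length;
        return ((id % n) + n) % n;
    }

    public TValue Get(int id)
    {
        int region = RegionOf(id);
        // Only callers whose IDs fall in the same region contend for this lock.
        lock (_regionLocks[region])
        {
            TValue value;
            if (!_regions[region].TryGetValue(id, out value))
            {
                value = _load(id);
                _regions[region].Add(id, value);
            }
            return value;
        }
    }
}
```

With 10 regions, a slow load for ID 23 blocks only requests for other IDs ending in 3; requests in the other nine regions proceed independently.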
Update (per comment):
Well this has been fun. I think the following is about as fine-grained locking as you can hope for without going totally insane (or maintaining/synchronizing a dictionary of locks for each cache key). I haven't tested it so there are probably bugs, but the idea should be illustrated. Track a list of requested IDs, and then use that to decide if you need to get the item yourself, or if you merely need to wait for a previous request to finish. Waiting (and cache insertion) is synchronized with tightly-scoped thread blocking and signaling using Wait and PulseAll. Access to the requested ID list is synchronized with a tightly-scoped ReaderWriterLockSlim.
This is a read-only cache. If you're doing creates/updates/deletes, you'll have to make sure you remove IDs from _requestedIds once they're received (before the call to Monitor.PulseAll(_cache) you'll want to add another try..finally and acquire the _requestedIdsLock write-lock). Also, with creates/updates/deletes, the easiest way to manage the cache would be to merely remove the existing item from _cache if/when the underlying create/update/delete operation succeeds.
(Oops, see update 2 below.)
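using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.Linq;
using System.Threading;

public class Item
{
    public int ID { get; set; }
}

public class AsyncCache
{
    // stand-in for the external data store, keyed by item ID
    protected static readonly Dictionary<int, Item> _externalDataStoreProxy = new Dictionary<int, Item>();
    protected static readonly Dictionary<int, Item> _cache = new Dictionary<int, Item>();
    // IDs that some thread has already started fetching
    protected static readonly HashSet<int> _requestedIds = new HashSet<int>();
    protected static readonly ReaderWriterLockSlim _requestedIdsLock = new ReaderWriterLockSlim();
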
    public Item Get(int id)
    {
        // if item does not exist in cache
        // (note: this check reads _cache without holding its lock)
        if (!_cache.ContainsKey(id))
        {
            _requestedIdsLock.EnterUpgradeableReadLock();
            try
            {
                // if item was already requested by another thread
                if (_requestedIds.Contains(id))
                {
                    _requestedIdsLock.ExitUpgradeableReadLock();
                    lock (_cache)
                    {
                        while (!_cache.ContainsKey(id))
                            Monitor.Wait(_cache);
                        // once we get here, _cache has our item
                    }
                }
                // else, item has not yet been requested by a thread
                else
                {
                    _requestedIdsLock.EnterWriteLock();
                    try
                    {
                        // record the current request
                        _requestedIds.Add(id);
                        _requestedIdsLock.ExitWriteLock();
                        _requestedIdsLock.ExitUpgradeableReadLock();

                        // get the data from the external resource
                        #region fake implementation - replace with real code
                        var item = _externalDataStoreProxy[id];
                        Thread.Sleep(10000);
                        #endregion

                        lock (_cache)
                        {
                            _cache.Add(id, item);
                            // wake every thread waiting on _cache so it can re-check
                            Monitor.PulseAll(_cache);
                        }
                    }
                    finally
                    {
                        // let go of any held locks
                        if (_requestedIdsLock.IsWriteLockHeld)
                            _requestedIdsLock.ExitWriteLock();
                    }
                }
            }
            finally
            {
                // let go of any held locks; an upgradeable lock must be
                // released with ExitUpgradeableReadLock, not ExitReadLock
                if (_requestedIdsLock.IsUpgradeableReadLockHeld)
                    _requestedIdsLock.ExitUpgradeableReadLock();
            }
        }
        return _cache[id];
    }

    public Collection<Item> Get(Collection<int> ids)
    {
        // snapshot the missing IDs; a lazy query would be re-evaluated later
        // against a changed _cache and _requestedIds
        var notInCache = ids.Except(_cache.Keys).ToList();
        // if some items don't exist in cache
        if (notInCache.Count > 0)
        {
            _requestedIdsLock.EnterUpgradeableReadLock();
            try
            {
                var needToGet = notInCache.Except(_requestedIds).ToList();
                // if any items have not yet been requested by other threads
                if (needToGet.Count > 0)
                {
                    _requestedIdsLock.EnterWriteLock();
                    try
                    {
                        // record the current request
                        foreach (var id in needToGet)
                            _requestedIds.Add(id);
                        _requestedIdsLock.ExitWriteLock();
                        _requestedIdsLock.ExitUpgradeableReadLock();

                        // get the data from the external resource
                        #region fake implementation - replace with real code
                        var data = new Collection<Item>();
                        foreach (var id in needToGet)
                        {
                            var item = _externalDataStoreProxy[id];
                            data.Add(item);
                        }
                        Thread.Sleep(10000);
                        #endregion

                        lock (_cache)
                        {
                            foreach (var item in data)
                                _cache.Add(item.ID, item);
                            Monitor.PulseAll(_cache);
                        }
                    }
                    finally
                    {
                        // let go of any held locks
                        if (_requestedIdsLock.IsWriteLockHeld)
                            _requestedIdsLock.ExitWriteLock();
                    }
                }
                if (_requestedIdsLock.IsUpgradeableReadLockHeld)
                    _requestedIdsLock.ExitUpgradeableReadLock();

                var waitingFor = notInCache.Except(needToGet).ToList();
                // if any remaining items were already requested by other threads
                if (waitingFor.Count > 0)
                {
                    lock (_cache)
                    {
                        // check before waiting so an earlier pulse is not missed
                        while (waitingFor.Any(id => !_cache.ContainsKey(id)))
                            Monitor.Wait(_cache);
                        // once we get here, _cache has all our items
                    }
                }
            }
            finally
            {
                // let go of any held locks
                if (_requestedIdsLock.IsUpgradeableReadLockHeld)
                    _requestedIdsLock.ExitUpgradeableReadLock();
            }
        }
        return new Collection<Item>(ids.Select(id => _cache[id]).ToList());
    }
}
Update 2:
I misunderstood the behavior of UpgradeableReadLock... only one thread at a time can hold an UpgradeableReadLock. So the above should be refactored to only grab Read locks initially, and to completely relinquish them and acquire a full-fledged Write lock when adding items to _requestedIds.
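A rough, untested sketch of that refactoring for the single-item case might look like this. The class name `ReadThenWriteCache`, the helper `TryClaim`, and the `load` delegate are hypothetical; unlike the code above, this version also takes the `_cache` lock for the initial lookup instead of reading the dictionary unsynchronized.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical refactoring of the single-item Get along the lines of Update 2.
public class ReadThenWriteCache<TValue>
{
    private readonly Dictionary<int, TValue> _cache = new Dictionary<int, TValue>();
    private readonly HashSet<int> _requestedIds = new HashSet<int>();
    private readonly ReaderWriterLockSlim _requestedIdsLock = new ReaderWriterLockSlim();
    private readonly Func<int, TValue> _load; // stand-in for the external data store

    public ReadThenWriteCache(Func<int, TValue> load)
    {
        _load = load;
    }

    public TValue Get(int id)
    {
        TValue value;
        lock (_cache)
        {
            if (_cache.TryGetValue(id, out value)) return value;
        }

        if (TryClaim(id))
        {
            // This thread owns the request; load outside of any lock.
            value = _load(id);
            lock (_cache)
            {
                _cache[id] = value;
                Monitor.PulseAll(_cache);
            }
            return value;
        }

        // Another thread owns the request; wait for it to publish the item.
        lock (_cache)
        {
            while (!_cache.TryGetValue(id, out value))
                Monitor.Wait(_cache);
            return value;
        }
    }

    // Returns true if the caller is the first to request this ID.
    private bool TryClaim(int id)
    {
        // Plain read lock first: many threads can check concurrently.
        _requestedIdsLock.EnterReadLock();
        try
        {
            if (_requestedIds.Contains(id)) return false;
        }
        finally
        {
            _requestedIdsLock.ExitReadLock();
        }

        // Read lock fully released; now take the write lock.
        _requestedIdsLock.EnterWriteLock();
        try
        {
            // HashSet.Add returns false if another thread claimed the ID
            // between releasing the read lock and acquiring the write lock.
            return _requestedIds.Add(id);
        }
        finally
        {
            _requestedIdsLock.ExitWriteLock();
        }
    }
}
```

As with the original, if the load throws, threads waiting on that ID stay blocked, so real code would need to handle that failure path.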