I was watching "The zen of async: Best practices for best performance" and Stephen Toub started to talk about Task caching, where instead of caching the results of asynchronous jobs you cache the Task objects themselves.
Let's assume you are talking to a remote service which takes the name of a city and returns its zip codes. The service is remote and under load, so we are talking to a method with an asynchronous signature:
interface IZipCodeService
{
    Task<ICollection<string>> GetZipCodesAsync(string cityName);
}
Since the service needs a while for every request, we would like to implement a local cache for it. Naturally the cache will also have an asynchronous signature, maybe even implementing the same interface (see Facade pattern). A synchronous signature would break the best practice of never calling asynchronous code synchronously with .Wait(), .Result or similar. At the very least the cache should leave that decision up to the caller.
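Just to illustrate what a synchronous signature would force on us (this snippet is not part of the design; service stands for any IZipCodeService instance), the facade would have to block somewhere:

ICollection<string> GetZipCodes(IZipCodeService service, string cityName)
{
    // .Result blocks the calling thread until the remote call finishes and can
    // deadlock when the awaited code captures a SynchronizationContext (e.g. a UI thread)
    return service.GetZipCodesAsync(cityName).Result;
}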
So let's do a first iteration on this:
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

class ZipCodeCache : IZipCodeService
{
    private readonly IZipCodeService realService;
    private readonly ConcurrentDictionary<string, ICollection<string>> zipCache =
        new ConcurrentDictionary<string, ICollection<string>>();

    public ZipCodeCache(IZipCodeService realService)
    {
        this.realService = realService;
    }

    public Task<ICollection<string>> GetZipCodesAsync(string cityName)
    {
        ICollection<string> zipCodes;
        if (zipCache.TryGetValue(cityName, out zipCodes))
        {
            // Already in cache. Returning cached value
            return Task.FromResult(zipCodes);
        }

        return this.realService.GetZipCodesAsync(cityName).ContinueWith((task) =>
        {
            this.zipCache.TryAdd(cityName, task.Result);
            return task.Result;
        });
    }
}
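Purely to make the facade idea concrete, wiring the cache in front of the real client could look like the sketch below (RemoteZipCodeService and the city name are made up for illustration; only the shared interface matters):

public async Task PrintZipCodesAsync()
{
    IZipCodeService remote = new RemoteZipCodeService(); // hypothetical real client behind IZipCodeService
    IZipCodeService cached = new ZipCodeCache(remote);   // same interface, so callers don't notice the cache

    ICollection<string> first = await cached.GetZipCodesAsync("Berlin");  // goes out to the service
    ICollection<string> second = await cached.GetZipCodesAsync("Berlin"); // answered from the cache
}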
As you can see, the cache does not store Task objects but the returned zip code collections. By doing so it has to construct a new Task for every cache hit by calling Task.FromResult, and I think that is exactly what Stephen Toub tries to avoid. A Task object comes with overhead, especially for the garbage collector: you are not only creating garbage, but every Task also has a finalizer which the runtime needs to take into account.
The only way to work around this is to cache the whole Task object:
class ZipCodeCache2 : IZipCodeService
{
    private readonly IZipCodeService realService;
    private readonly ConcurrentDictionary<string, Task<ICollection<string>>> zipCache =
        new ConcurrentDictionary<string, Task<ICollection<string>>>();

    public ZipCodeCache2(IZipCodeService realService)
    {
        this.realService = realService;
    }

    public Task<ICollection<string>> GetZipCodesAsync(string cityName)
    {
        Task<ICollection<string>> zipCodes;
        if (zipCache.TryGetValue(cityName, out zipCodes))
        {
            // Already in cache. Returning the cached Task itself, no new allocation
            return zipCodes;
        }

        return this.realService.GetZipCodesAsync(cityName).ContinueWith((task) =>
        {
            this.zipCache.TryAdd(cityName, task);
            return task.Result;
        });
    }
}
As you can see, the Task creation via Task.FromResult is gone. Note that you cannot avoid this Task creation when using the async/await keywords, because the compiler-generated state machine will create a Task to return no matter what your code has cached. Something like:
public async Task<ICollection<string>> GetZipCodesAsync(string cityName)
{
    Task<ICollection<string>> zipCodes;
    if (zipCache.TryGetValue(cityName, out zipCodes))
    {
        return zipCodes;
    }
will not compile, because inside an async method returning Task<ICollection<string>> the return statement must yield an ICollection<string>, not the Task itself.
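For contrast, a version that does compile would have to await the cached Task and return its value, so the compiler-generated state machine allocates a fresh Task on every call anyway. A sketch of what that would look like (with a simplified cache-miss path):

public async Task<ICollection<string>> GetZipCodesAsync(string cityName)
{
    Task<ICollection<string>> zipCodes;
    if (zipCache.TryGetValue(cityName, out zipCodes))
    {
        // compiles, but the state machine re-wraps the value in a new Task<ICollection<string>>,
        // so the cached Task object is never handed back to the caller directly
        return await zipCodes;
    }

    Task<ICollection<string>> serviceCall = this.realService.GetZipCodesAsync(cityName);
    this.zipCache.TryAdd(cityName, serviceCall); // caches the in-flight Task, a slight behavioral difference
    return await serviceCall;
}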
Don't get confused by Stephen Toub's ContinueWith flags TaskContinuationOptions.OnlyOnRanToCompletion and TaskContinuationOptions.ExecuteSynchronously. They are just another performance optimization and are not related to the main objective of caching Tasks.
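For completeness, here is a sketch of how those flags could be attached to the continuation in ZipCodeCache2 (not necessarily identical to Toub's sample, and note that OnlyOnRanToCompletion changes the error behavior):

public Task<ICollection<string>> GetZipCodesAsync(string cityName)
{
    Task<ICollection<string>> zipCodes;
    if (zipCache.TryGetValue(cityName, out zipCodes))
    {
        return zipCodes;
    }

    return this.realService.GetZipCodesAsync(cityName).ContinueWith(task =>
        {
            this.zipCache.TryAdd(cityName, task);
            return task.Result;
        },
        // run only when the service call succeeded: a faulted call is not cached,
        // and the returned continuation ends up canceled instead of faulted
        TaskContinuationOptions.OnlyOnRanToCompletion |
        // run the tiny continuation inline instead of scheduling it on the thread pool
        TaskContinuationOptions.ExecuteSynchronously);
}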
As with every cache, you should consider a mechanism that cleans the cache from time to time and removes entries which are too old or invalid. You could also implement a policy that limits the cache to n entries and keeps the most frequently requested items by introducing some counting, as sketched below.
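A very rough sketch of such a size limit for ZipCodeCache2 (MaxEntries and AddToCache are made up for illustration; a real implementation would want proper LRU/LFU bookkeeping):

private const int MaxEntries = 1000; // hypothetical limit, tune to your workload

// call this from the continuation instead of zipCache.TryAdd directly
private void AddToCache(string cityName, Task<ICollection<string>> task)
{
    if (this.zipCache.Count >= MaxEntries)
    {
        // naive eviction: remove whichever entry the enumerator happens to return first;
        // a real LRU/LFU policy would track timestamps or hit counters per entry
        foreach (string key in this.zipCache.Keys)
        {
            Task<ICollection<string>> removed;
            if (this.zipCache.TryRemove(key, out removed))
            {
                break;
            }
        }
    }

    this.zipCache.TryAdd(cityName, task);
}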
I did some benchmarking with and without caching of Tasks. You can find the code here: http://pastebin.com/SEr2838A. On my machine (with .NET 4.6) the results look like this:
Caching ZipCodes: 00:00:04.6653104
Gen0: 3560 Gen1: 0 Gen2: 0
Caching Tasks: 00:00:03.9452951
Gen0: 1017 Gen1: 0 Gen2: 0