I understand that with threadless async there are more threads available to service inputs (e.g. an HTTP request), but I don't understand how that doesn't potentially cause…
Suppose you have a web application which handles a request with a very common flow:

1. Pre-process the request (short, CPU-bound).
2. Perform some IO.
3. Post-process the result (short, CPU-bound).

IO in this case can be a database query, a socket read/write, a file read/write and so on.

For an example of IO let's take file reading, and some arbitrary but realistic timings:

- Pre-processing: 1 ms
- File read: 300 ms
- Post-processing: 1 ms
Now suppose 100 requests come in at 1 ms intervals. How many threads will you need to handle those requests without delay with synchronous processing like this?
public IActionResult GetSomeFile(RequestParameters p) {
    string filePath = Preprocess(p);
    var data = System.IO.File.ReadAllBytes(filePath);
    return PostProcess(data);
}
Well, 100 threads, obviously. Since the file read takes 300 ms in our example, by the time the 100th request comes in, the previous 99 threads are still blocked on file reads.
Now let's "use async await":
public async Task<IActionResult> GetSomeFileAsync(RequestParameters p) {
    string filePath = Preprocess(p);
    byte[] data;
    using (var fs = System.IO.File.OpenRead(filePath)) {
        data = new byte[fs.Length];
        await fs.ReadAsync(data, 0, data.Length);
    }
    return PostProcess(data);
}
How many threads are needed now to handle 100 requests without delay? Still 100. That's because a file can be opened in "synchronous" or "asynchronous" mode, and by default it opens in "synchronous" mode. That means even though you are using ReadAsync, the underlying IO is not asynchronous and some thread from the thread pool is blocked waiting for the result. Did we achieve anything useful by doing that? In the context of a web application - not at all.
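You can check at runtime which mode a stream actually ended up in: FileStream exposes an IsAsync property. A minimal sketch (the file path here is just a placeholder for illustration):

```csharp
// File.OpenRead gives you a synchronous handle - ReadAsync on it
// will block a thread pool thread under the covers.
using (var syncStream = System.IO.File.OpenRead("somefile.bin"))
{
    Console.WriteLine(syncStream.IsAsync); // False
}

// Passing useAsync: true (equivalent to FileOptions.Asynchronous)
// binds the handle for overlapped IO.
using (var asyncStream = new System.IO.FileStream(
    "somefile.bin",
    System.IO.FileMode.Open,
    System.IO.FileAccess.Read,
    System.IO.FileShare.Read,
    bufferSize: 4096,
    useAsync: true))
{
    Console.WriteLine(asyncStream.IsAsync); // True
}
```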
Now let's open file in "asynchronous" mode:
public async Task<IActionResult> GetSomeFileReallyAsync(RequestParameters p) {
    string filePath = Preprocess(p);
    byte[] data;
    using (var fs = new FileStream(filePath, FileMode.Open, FileAccess.Read, FileShare.Read, 4096, FileOptions.Asynchronous)) {
        data = new byte[fs.Length];
        await fs.ReadAsync(data, 0, data.Length);
    }
    return PostProcess(data);
}
How many threads do we need now? Now one thread is enough, in theory. When you open a file in "asynchronous" mode, reads and writes will utilize (on Windows) Windows overlapped IO.
In simplified terms it works like this: there is a queue-like object (an IO completion port) where the OS can post notifications about completions of certain IO operations. The .NET thread pool registers one such IO completion port. There is only one thread pool per .NET application, so there is one IO completion port.
When a file is opened in "asynchronous" mode, its file handle is bound to this IO completion port. Now when you call ReadAsync, no thread is blocked waiting for that specific read while the actual read is performed. When the OS notifies the .NET completion port that IO for this file handle has completed, the .NET thread pool executes the continuation on a thread pool thread.
Now let's see how processing 100 requests arriving at 1 ms intervals can go in our scenario:
Request 1 comes in: we grab a thread from the pool to execute the 1 ms pre-processing step. Then the thread starts an asynchronous read. It doesn't need to block waiting for completion, so it returns to the pool.
Request 2 comes in. We already have a thread in the pool which has just finished pre-processing request 1. We don't need an additional thread - we can reuse that one.
The same is true for all 100 requests.
After pre-processing all 100 requests, there are about 200 ms until the first IO completion arrives, during which our single thread can do even more useful work.
IO completion events start to arrive - but our post-processing step is also very short (1 ms), so again one thread can handle them all.
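The walkthrough above can be sketched in code. This is a simulation, not real file IO: Task.Delay(300) stands in for the asynchronous 300 ms read, Thread.Sleep(1) for the 1 ms CPU-bound pre/post-processing, and we record which threads ever run our code:

```csharp
using System.Collections.Concurrent;

var threadsUsed = new ConcurrentDictionary<int, bool>();

async Task HandleRequest()
{
    threadsUsed[Environment.CurrentManagedThreadId] = true;
    Thread.Sleep(1);        // pre-processing (CPU-bound, 1 ms)
    await Task.Delay(300);  // stands in for the overlapped 300 ms file read
    threadsUsed[Environment.CurrentManagedThreadId] = true;
    Thread.Sleep(1);        // post-processing (CPU-bound, 1 ms)
}

var requests = new List<Task>();
for (int i = 0; i < 100; i++)
{
    requests.Add(HandleRequest());
    await Task.Delay(1);    // requests arrive at ~1 ms intervals
}
await Task.WhenAll(requests);

// Typically far fewer than 100 distinct threads are observed.
Console.WriteLine($"Distinct threads used: {threadsUsed.Count}");
```

The exact thread count varies from run to run (the pool sizes itself dynamically), but it stays nowhere near 100, which is the point.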
This is an idealized scenario of course, but it shows how not "async await" itself but specifically asynchronous IO can help you "save threads".
What if our post-processing step is not short - what if we decided to do heavy CPU-bound work in it instead? Well, that will cause thread pool starvation. The thread pool will create new threads without delay until it reaches a configurable "low watermark" (which you can obtain via ThreadPool.GetMinThreads()
and change via ThreadPool.SetMinThreads()
). After that number of threads is reached, the thread pool will wait for one of the busy threads to become free. It will not wait forever of course; usually it waits for 0.5-1 seconds, and if no thread becomes free, it creates a new one. Still, that delay might slow your web application down quite a bit under heavy load. So don't violate the thread pool's assumptions - don't run long CPU-bound work on thread pool threads.
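A minimal sketch of inspecting and raising that "low watermark" (the value 64 below is an arbitrary example, not a recommendation - tune it to your workload):

```csharp
// Up to this many threads, the pool will create new ones without delay.
ThreadPool.GetMinThreads(out int minWorker, out int minIo);
Console.WriteLine($"Min worker threads: {minWorker}, min IO threads: {minIo}");

// Raise the watermark if you expect bursts of blocking work.
// Returns false if the values are rejected (e.g. below the core count).
bool ok = ThreadPool.SetMinThreads(64, 64);
Console.WriteLine($"SetMinThreads succeeded: {ok}");
```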
The fact is, the notion that async/await "saves threads" is a mixture of truth and bullshit. It is true that it doesn't generally involve creating more threads just to service a particular task, but it happily glosses over the fact that under the covers there are a number of threads waiting for events on completion ports, which are created by the runtime. The number of completion port threads is roughly the number of processor cores in the system. So, on a system with eight processor cores, there are around eight threads waiting for IO completion events. In an application that goes nuts with async IO, that's great, but in an application that doesn't do much IO, they're mostly just sitting there eating resources, not "saving threads" by any stretch of the imagination.
When an async IO operation completes, one of those threads will "wake up" and eventually call the continuation on whatever task is relevant. If all of the completion threads are busy executing continuations (perhaps because the developer has made the mistake of doing a lot of CPU-intensive work in continuations) when another IO operation completes, that completion will not be handled until one of the completion threads is freed up and is able to handle it. This is what is referred to as "thread starvation", and it is why it is recommended to start more threads than the number of processor cores in applications that make heavy use of async IO.
The problem with .NET and async IO and the blanket notion that async IO "saves threads" is that many developers don't understand what's actually happening under the covers, and misusing the async/await pattern in ways that can starve the completion thread pool is all too easy.
In any case, "threadless" is not a term that makes any sense whatsoever here.