Throttle async tasks?

主宰稳场 submitted on 2019-12-07 07:03:00

Question


I would like to know whether async tasks should be throttled when there are many of them to complete. Say you have 1000 URLs: do you fire all the requests at once and wait for all of them:

var tasks = urlList.Select(url => downloadAsync(url));
await Task.WhenAll(tasks);

Or do you batch the requests and process one batch after another:

foreach (var urlBatch in urlList.BatchEnumerable(BatchSize)){
    var tasks = urlBatch.Select(url => downloadAsync(url));
    await Task.WhenAll(tasks);
}

I thought batching was not necessary, because the first approach (firing all requests at once) creates tasks that are scheduled by the ThreadPool, so we should let the ThreadPool decide when to execute each task. However, I was told that in practice this only works if the tasks are compute tasks. When the tasks involve network requests, the first approach could cause the host machine to hang. Why is that?


Answer 1:


In most cases you want to limit yourself to some degree of concurrency. Whenever multiple operations run concurrently, some state is kept somewhere. If the operations are CPU-bound, the tasks sit in the ThreadPool queue waiting for a thread; if they are async, each one has a state machine sitting on the heap.

Even async operations usually use up some limited resource, be it bandwidth, ports, the remote DB server's CPU, etc.

You don't have to limit yourself to a single batch at a time, though (with batches you have to wait for the last operation in a batch to complete before starting any others). You can throttle using a SemaphoreSlim or, even better, a TPL Dataflow block:

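// ActionBlock lives in the System.Threading.Tasks.Dataflow namespace.
var block = new ActionBlock<string>(
   url => downloadAsync(url),
   // At most 10 downloads are in flight at any time.
   new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 10 });

// Queue every URL; the block starts a new download as soon as a slot frees up.
urlList.ForEach(url => block.Post(url));

// Signal that no more items will be posted, then wait for the queue to drain.
block.Complete();
await block.Completion;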
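
For comparison, the SemaphoreSlim option mentioned above could look roughly like the following. This is only a minimal sketch: the class name Throttling and the helper ForEachThrottledAsync are made up for illustration and are not part of the framework; only SemaphoreSlim, WaitAsync, Release and Task.WhenAll are real APIs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, named for illustration only.
public static class Throttling
{
    // Runs `work` for every item while keeping at most
    // `maxConcurrency` operations in flight at any moment.
    public static async Task ForEachThrottledAsync<T>(
        IEnumerable<T> items, int maxConcurrency, Func<T, Task> work)
    {
        using var gate = new SemaphoreSlim(maxConcurrency);

        // ToList() forces the query to run now, so every wrapper task is
        // created immediately; each one then waits at the gate for its turn.
        var tasks = items.Select(async item =>
        {
            await gate.WaitAsync();   // wait for a free slot
            try
            {
                await work(item);
            }
            finally
            {
                gate.Release();       // free the slot as soon as this item ends
            }
        }).ToList();

        await Task.WhenAll(tasks);
    }
}
```

With the names from the question, the call would be something like await Throttling.ForEachThrottledAsync(urlList, 10, url => downloadAsync(url)). Unlike batching, a new download starts as soon as any earlier one finishes. Note that this sketch still allocates one small wrapper task per URL up front, whereas the Dataflow block only keeps the URLs themselves in its queue.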


Source: https://stackoverflow.com/questions/35023685/throttle-async-tasks
