Best way to limit the number of active Tasks running via the Parallel Task Library

春和景丽 2020-12-02 16:54

Consider a queue holding a lot of jobs that need processing. The queue's limitation is that you can only get one job at a time, and there is no way of knowing how many jobs there a

6 answers
  •  抹茶落季
    2020-12-02 17:40

    The problem here doesn't seem to be too many running Tasks; it's too many scheduled Tasks. Your code will try to schedule as many Tasks as it can, no matter how fast they are executed. If you have too many jobs, this means you will eventually run out of memory (OOM).

    Because of this, none of your proposed solutions will actually solve your problem. If it seems that simply specifying LongRunning solves your problem, then that's most likely because creating a new Thread (which is what LongRunning does) takes some time, which effectively throttles getting new jobs. So, this solution only works by accident, and will most likely lead to other problems later on.

    Regarding the solution, I mostly agree with usr: the simplest solution that works reasonably well is to create a fixed number of LongRunning tasks and have each of them run a loop that calls Queue.PopJob() (protected by a lock if that method is not thread-safe) and Execute()s the job.
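
    The fixed-worker approach described above can be sketched roughly as follows. This is only an illustration under several assumptions: Job, JobQueue, WorkerPoolSketch and RunWorkers are hypothetical stand-ins (the question's real types are not shown), PopJob() is assumed to return null when the queue is empty, and it is assumed not to be thread-safe, hence the lock.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Threading;
    using System.Threading.Tasks;

    // Hypothetical stand-in for the question's job type.
    class Job
    {
        public void Execute()
        {
            Thread.Sleep(5); // simulate some blocking work
        }
    }

    // Hypothetical stand-in for the question's queue.
    // Assumed NOT thread-safe; PopJob() returns null when no job is left.
    class JobQueue
    {
        private readonly Queue<Job> jobs = new Queue<Job>();

        public JobQueue(int count)
        {
            for (int i = 0; i < count; i++)
                jobs.Enqueue(new Job());
        }

        public Job PopJob()
        {
            return jobs.Count > 0 ? jobs.Dequeue() : null;
        }
    }

    static class WorkerPoolSketch
    {
        // Starts a fixed number of LongRunning tasks; each one loops,
        // popping and executing jobs until the queue reports it is empty.
        // Returns the number of jobs that were executed.
        public static int RunWorkers(JobQueue queue, int workerCount)
        {
            var popLock = new object();
            int processed = 0;
            var workers = new Task[workerCount];

            for (int i = 0; i < workerCount; i++)
            {
                workers[i] = Task.Factory.StartNew(() =>
                {
                    while (true)
                    {
                        Job job;
                        lock (popLock) // only the pop is serialized
                        {
                            job = queue.PopJob();
                        }
                        if (job == null)
                            break; // assumption: null means "no more jobs"

                        job.Execute(); // runs outside the lock, in parallel
                        Interlocked.Increment(ref processed);
                    }
                }, TaskCreationOptions.LongRunning);
            }

            Task.WaitAll(workers);
            return processed;
        }

        static void Main()
        {
            int processed = RunWorkers(new JobQueue(20), 4);
            Console.WriteLine("Processed jobs: " + processed);
        }
    }
    ```

    Because each worker holds at most one job at a time, the number of jobs in memory is bounded by the number of workers, no matter how many jobs the queue contains. If an empty queue can fill up again later, the workers would have to wait and retry instead of exiting on null.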

    UPDATE: After some more thinking, I realized the following attempt will most likely behave terribly. Use it only if you're really sure it will work well for you.


    But the TPL tries to figure out the best degree of parallelism, even for IO-bound Tasks. So, you might try to use that to your advantage. LongRunning Tasks won't work here, because from the point of view of the TPL, it seems like no work is being done, and it will start new Tasks over and over. What you can do instead is to start a new Task at the end of each Task. This way, the TPL will know what's going on and its algorithm may work well. Also, to let the TPL decide the degree of parallelism, start another line of Tasks at the start of each Task that is first in its line.

    This algorithm may work well. But it's also possible that the TPL will make a bad decision regarding the degree of parallelism; I haven't actually tried anything like this.

    In code, it would look like this:

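    void ProcessJobs(bool isFirst)
    {
        var job = Queue.PopJob(); // assumes PopJob() is thread-safe
        if (job == null)
            return; // queue is empty, so this line of Tasks ends

        // The first Task in a line starts another line,
        // which lets the TPL decide the degree of parallelism.
        if (isFirst)
            Task.Factory.StartNew(() => ProcessJobs(true));

        job.Execute();

        // Continue this line with a new Task once the job is done.
        Task.Factory.StartNew(() => ProcessJobs(false));
    }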
    

    And start it with

    Task.Factory.StartNew(() => ProcessJobs(true));
    
