Do I need to synchronize resource access between Tasks with the default TaskScheduler?

Posted by 允我心安 on 2021-02-05 07:58:06

Question


I'm programming with Tasks and async/await. I assumed that multithreading works like it does in NodeJS or Python, that is, it doesn't: everything just runs on the same thread. But I've been trying to learn how Tasks actually get executed, and my understanding is that they're executed by TaskScheduler.Default, whose implementation is hidden but can be expected to use a ThreadPool.

Should I be programming as if all my Tasks can run in any thread?

The extent of my asynchronous programming is fairly lightweight CPU work consisting of several infinite loops that do work and then await Task.Delay for several seconds. Right now the only shared resource is an int that increments every time I write a network message, but in the future I expect my tasks will be sharing Dictionaries and Lists.

I also have a network Task that connects to a TCP server and reads messages using a Task I implemented on top of BeginRead+EndRead. The Read function is called by an infinite loop that reads a message, processes it, then reads a new message.

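        // Assumes: using System; using System.IO; using System.Net.Sockets; using System.Threading.Tasks;

        // Completes the TaskCompletionSource when the pending BeginRead finishes.
        void OnRead(IAsyncResult result)
        {
            var pair = (Tuple<TaskCompletionSource<int>, NetworkStream>)result.AsyncState;
            try
            {
                pair.Item1.SetResult(pair.Item2.EndRead(result));
            }
            catch (Exception ex)
            {
                // Fault the Task so the awaiting caller sees the error instead of waiting forever.
                pair.Item1.SetException(ex);
            }
        }

        // Reads exactly `size` bytes from the stream.
        async Task<byte[]> Read(NetworkStream stream, uint size)
        {
            var result = new byte[size];
            var count = 0;
            while (count < size)
            {
                var tcs = new TaskCompletionSource<int>();
                stream.BeginRead(result, count, result.Length - count, OnRead, Tuple.Create(tcs, stream));
                var read = await tcs.Task;
                // EndRead returns 0 when the remote side has closed the connection.
                if (read == 0)
                    throw new EndOfStreamException("Connection closed before the full message arrived.");
                count += read;
            }
            return result;
        }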

I write to the NetworkStream using synchronous writes.


Answer 1:


> I assumed that the multithreading works like it does in NodeJS or Python, that is, it doesn't, everything just runs on the same thread. But I've been trying to learn how Tasks actually get executed and my understanding is that they're executed by TaskScheduler.Default whose implementation is hidden but can be expected to use a ThreadPool.

Not exactly.

First, a Task in .NET can be one of two completely different things. Delegate Tasks represent code that can run on some thread, using a TaskScheduler to determine where and how they run. Delegate Tasks were introduced with the original Task Parallel Library and are almost never used with asynchronous code. The other kind is the Promise Task. These are much more similar to Promise in JavaScript: they can represent anything - they're just an object that is either "not finished yet" or "finished with a result" or "finished with an error". The two kinds of tasks also have different state diagrams.
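The difference between the two kinds can be sketched in a few lines. This is a minimal illustration, not code from the question; the class and method names (TaskKinds, DelegateTask, PromiseTask) are made up for the example.

```csharp
using System;
using System.Threading.Tasks;

public static class TaskKinds
{
    // Delegate Task: wraps code to execute. Task.Run queues the delegate
    // to the thread pool via TaskScheduler.Default.
    public static Task<int> DelegateTask() => Task.Run(() => 6 * 7);

    // Promise Task: has no code attached. It stays pending until someone
    // completes it through the TaskCompletionSource.
    public static Task<int> PromiseTask(out TaskCompletionSource<int> tcs)
    {
        tcs = new TaskCompletionSource<int>();
        return tcs.Task;
    }
}
```

The TaskCompletionSource in the question's Read method produces exactly this second kind of Task: nothing "runs" it, and it finishes only when OnRead calls SetResult.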

So, the first thing to recognize is that just like you don't "execute a Promise" in JavaScript, you don't "execute a (Promise) Task" in .NET. So asking what thread it runs on doesn't make sense, since they don't run anywhere.

However, both JS and C# have an async/await language construct that allows you to write more natural code to control promises. When the async method completes, the promise is completed; if the async method throws, the promise is faulted.
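A small sketch of that mapping, using hypothetical method names (SucceedsAsync, FailsAsync): returning from the method completes the Task, and throwing inside it faults the Task rather than raising the exception at the call site.

```csharp
using System;
using System.Threading.Tasks;

public static class AsyncOutcome
{
    // The returned Task completes with 42 once the method returns.
    public static async Task<int> SucceedsAsync()
    {
        await Task.Yield();
        return 42;
    }

    // The exception is captured and stored on the returned Task;
    // the caller observes it when awaiting that Task.
    public static async Task<int> FailsAsync()
    {
        await Task.Yield();
        throw new InvalidOperationException("boom");
    }
}
```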

So the question then becomes: where does the code run that controls this promise?

In the JavaScript world, the answer is obvious: there is only one thread, so that is where the code runs. In the .NET world, the answer is a bit more complex. My async intro gives the core concepts: every async method begins executing synchronously, on the calling thread, just like any other method. When it yields due to an await, it will capture its "context". Then, when that async method is ready to resume after the await, it resumes within that "context".

The "context" is SynchronizationContext.Current, unless it is null, in which case the context is TaskScheduler.Current. In modern code, the "context" is usually either a GUI thread context (which always resumes on the GUI thread), or the thread pool context (which resumes on any available thread pool thread).
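One way to observe this is to record the thread before and after an await. The sketch below is an illustration with a made-up helper (ContextProbe.ProbeAsync); it assumes nothing about the host application and simply reports what it sees.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ContextProbe
{
    // Returns the managed thread IDs seen before and after an await,
    // plus whether a SynchronizationContext was present at the start.
    public static async Task<(int Before, int After, bool HadContext)> ProbeAsync()
    {
        bool hadContext = SynchronizationContext.Current != null;
        int before = Environment.CurrentManagedThreadId;
        await Task.Delay(50);
        int after = Environment.CurrentManagedThreadId;
        return (before, after, hadContext);
    }
}
```

Called from a console application's Main, HadContext would normally be false and After may differ from Before, because the continuation resumes on whichever thread pool thread is available. Called from a GUI thread, the two IDs would be expected to match.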

> Should I be programming as if all my Tasks can run in any thread?

The code in your async methods can resume on a thread pool thread if it's called without a context.

> Do I need to synchronize resource access between Tasks

Probably not. The async and await keywords are designed to allow easy writing of serial code. So there's no need to synchronize code before an await with code after an await; the code after the await will always run after the code before the await, even if it runs on a different thread. Also, await injects all necessary thread barriers, so there are no issues around out-of-order reads or anything like that.
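As a sketch of the serial case (SerialLoop and CountAsync are hypothetical names): a single async method that mutates its own state across awaits needs no lock, because each iteration finishes before the next one starts, whichever thread resumes it.

```csharp
using System;
using System.Threading.Tasks;

public static class SerialLoop
{
    // Only this one method touches `counter`, and its statements run
    // strictly in order, so no synchronization is required.
    public static async Task<int> CountAsync(int iterations)
    {
        int counter = 0;
        for (int i = 0; i < iterations; i++)
        {
            counter++;
            await Task.Delay(1); // may resume on a different thread pool thread
        }
        return counter;
    }
}
```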

However, if your code runs multiple async methods at the same time, and those methods share data, then that would need to be synchronized. I have a blog post that covers this kind of accidental implicit parallelism (at the end of the post). Generally speaking, asynchronous code encourages returning results rather than applying side effects, and as long as you do that, implicit parallelism is less of a problem.
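For the shared int described in the question, one option is an atomic increment. This is a sketch under assumptions: SharedCounter, WorkerAsync, and RunAsync are hypothetical stand-ins for the question's loops, and several workers are started at once without a context so they can genuinely run in parallel.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class SharedCounter
{
    private static int _messagesWritten;

    public static int MessagesWritten => Volatile.Read(ref _messagesWritten);

    // Hypothetical worker: several of these run at the same time.
    private static async Task WorkerAsync(int writes)
    {
        for (int i = 0; i < writes; i++)
        {
            await Task.Yield(); // hop to the thread pool when there is no context
            // A plain _messagesWritten++ could lose updates here.
            Interlocked.Increment(ref _messagesWritten);
        }
    }

    // Starts all workers, waits for them, and returns the final count.
    public static async Task<int> RunAsync(int workers, int writesPerWorker)
    {
        Interlocked.Exchange(ref _messagesWritten, 0);
        var tasks = new Task[workers];
        for (int w = 0; w < workers; w++)
            tasks[w] = WorkerAsync(writesPerWorker);
        await Task.WhenAll(tasks);
        return MessagesWritten;
    }
}
```

For shared Dictionaries and Lists, the same reasoning would point toward a lock around each access or a type such as ConcurrentDictionary from System.Collections.Concurrent, since the ordinary collections are not safe for concurrent writes.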



Source: https://stackoverflow.com/questions/59491873/do-i-need-to-synchronize-resource-access-between-tasks-with-the-default-tasksche
