How to correctly use TPL with TcpClient?

Submitted by 那年仲夏 on 2020-06-27 15:49:12

Question


I wrote a server using TcpListener that is supposed to handle many thousands of concurrent connections.

Since I know that most of the time most connections will be idle (with the occasional ping-pong to make sure the other side is still there) async programming seemed to be the solution.

However after the first few hundred clients performance rapidly deteriorates. So rapidly in fact that I can barely reach 1000 concurrent connections.

The CPU is not maxed out (averaging at ~4%), RAM usage is <100MB, and there's not a lot of network traffic going on.

When I pause the server in Visual Studio and take a look at the 'Tasks' window, there are countless (hundreds of) tasks with status "scheduled" and only a few (fewer than 30) "running/active" tasks.

I tried to profile using Visual Studio as well as dotTrace Performance, but I couldn't find anything wrong. No lock contention, no "hot path" where a lot of CPU is used. It seems like the application just slows down overall.

The setup

I have a simple while(true) and inside it there's this:

var client = await tcpListener.AcceptTcpClientAsync().ConfigureAwait(false);
Task.Run(() => OnClient(client));
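For reference, the same accept loop can be written so the fire-and-forget intent is explicit and handler exceptions cannot go unobserved. This is only a sketch; the stubbed-in OnClient handler and the logging are illustrative, not from the original code:

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Threading.Tasks;

static class AcceptLoop
{
    // Hypothetical per-client handler; stands in for the real OnClient.
    static async Task OnClient(TcpClient client)
    {
        await Task.Yield();
    }

    public static async Task RunAsync(TcpListener listener)
    {
        while (true)
        {
            var client = await listener.AcceptTcpClientAsync().ConfigureAwait(false);
            // The discard (_) makes "fire and forget" explicit and silences the
            // unawaited-call hint; try/finally guarantees the socket is closed
            // and no handler exception is silently lost.
            _ = Task.Run(async () =>
            {
                try { await OnClient(client).ConfigureAwait(false); }
                catch (Exception ex) { Console.Error.WriteLine($"Client failed: {ex.Message}"); }
                finally { client.Close(); }
            });
        }
    }
}
```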

In order to handle the connection I made a few methods to encapsulate the different stages of the connection. For example inside the OnClient above there's await HandleLogin(...), and then it enters a while(client.IsConnected) loop that just does await stream.ReadBuffer(1). stream is just the normal NetworkStream that you get from TcpClient.GetStream, and ReadBuffer is a custom method implemented like this:

public static async Task<byte[]> ReadBuffer(this Stream stream, int length)
{
    byte[] buffer = new byte[length];
    int read = 0;

    while (read < length)
    {
        int remaining = length - read;

        int readNow = await stream.ReadAsync(buffer, read, remaining).ConfigureAwait(false);

        // ReadAsync returning 0 means the remote side closed the connection.
        if (readNow <= 0)
            throw new SocketException((int)SocketError.ConnectionReset);

        read += readNow;
    }

    return buffer;
}

I use .ConfigureAwait(false) at every single place where I await anything, because I have no need for any sort of synchronization context and I don't want to pay the performance overhead of retrieving/creating a synchronization context everywhere.

One thing I noticed is that when I spawn 50 connections from my test tool and then just close it abruptly (so all connections it made should receive a ConnectionReset SocketException on the server), it takes a long time for the server to react at all, oftentimes hanging completely until a new connection arrives.

Could it be that some continuations want to synchronize and run on some specific thread? It's possible (when disconnecting at the right moment) to make the server application almost unusable with as few as 20 connections.

What am I doing wrong? If it is some bug (which I assume it is), how would I go about finding it? I narrowed the problem down to many Tasks just sitting at NetworkStream.ReadAsync(...) even though they should instantly receive a SocketException (ConnectionReset).

I tried starting my test tool (which is just using TcpClient) on a remote machine as well as locally and I get the same results.

Edit 1

My OnClient is defined as async Task OnClient(TcpClient client). Inside it, it awaits the different stages of the connection: authentication, some settings negotiation, then entering a loop where it waits for messages.

I use Task.Run because I do not want to wait until one client is done; I want to accept all clients as fast as possible, spawning a new Task for each one. I am however unsure whether I couldn't/shouldn't just write OnClient(client) without the Task.Run around it and also without awaiting OnClient (this produces a compiler hint about the unawaited call that doesn't go away, but that is what I want, I think: I don't want to wait until the client is done).

The last stage

The last stage the connection enters after authentication and settings negotiation is a loop where the server waits for messages from the client. Before that, however, the server also starts another Task.Run() (with while(is connected) and await Task.Delay...) to send ping packets and handle a few other "management" things. All writes into the NetworkStream are synchronized using the async lock mechanism from the Nito AsyncEx library to make sure no packets are somehow interleaved. If any exception happens anywhere (when reading or writing), I always call .Close on the TcpClient to make sure all other pending incomplete reads and writes throw an exception.
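An equivalent way to serialize those writes without a third-party library is a plain SemaphoreSlim used as an async lock, one per connection. This is a sketch under that assumption; the PacketWriter type and SendAsync signature are illustrative names, not from the original code:

```csharp
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// One semaphore per connection serializes whole-packet writes, so that
// concurrent senders (the message handler, the ping loop) can never
// interleave their bytes on the wire.
sealed class PacketWriter
{
    private readonly Stream _stream;
    private readonly SemaphoreSlim _writeLock = new SemaphoreSlim(1, 1);

    public PacketWriter(Stream stream) => _stream = stream;

    public async Task SendAsync(byte[] packet)
    {
        await _writeLock.WaitAsync().ConfigureAwait(false);
        try
        {
            await _stream.WriteAsync(packet, 0, packet.Length).ConfigureAwait(false);
            await _stream.FlushAsync().ConfigureAwait(false);
        }
        finally
        {
            _writeLock.Release();
        }
    }
}
```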


Answer 1:


I narrowed the problem down to many Tasks just sitting at NetworkStream.ReadAsync(...) even though they should instantly receive a SocketException (ConnectionReset).

This is an incorrect assumption. You have to write to the socket to detect dropped connections.
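A minimal sketch of that idea: a ping loop whose writes, not reads, surface the dropped connection. The Heartbeat/PingLoop names and the one-byte packet are illustrative assumptions, not from the answer:

```csharp
using System;
using System.IO;
using System.Net.Sockets;
using System.Threading.Tasks;

static class Heartbeat
{
    // A half-open (silently dropped) connection is only discovered when a
    // write fails: pending reads just sit there forever, but writing to a
    // dead peer eventually provokes an RST and throws. The ping loop
    // therefore doubles as dead-peer detection.
    public static async Task PingLoop(TcpClient client, byte[] pingPacket, TimeSpan interval)
    {
        var stream = client.GetStream();
        while (true)
        {
            await Task.Delay(interval).ConfigureAwait(false);
            try
            {
                await stream.WriteAsync(pingPacket, 0, pingPacket.Length).ConfigureAwait(false);
            }
            catch (IOException) // NetworkStream wraps the SocketException here
            {
                client.Close(); // fails all other pending reads/writes on this connection
                return;
            }
        }
    }
}
```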

This is one of many pitfalls of TCP/IP programming, which is why I recommend people use SignalR if at all possible.

Other pitfalls that jump out from your code/description:

  • You're attempting to use asynchronous APIs, but your code also has Task.Run. So it's still doing a thread jump right away. This may be desirable or it may not. (Assuming OnClient is an async method; if it's using sync-over-async, then it's definitely not a good pattern).
  • while(client.IsConnected) is a common incorrect pattern. You should have both a read loop and write queue processor running simultaneously. In particular, IsConnected is absolutely meaningless - it literally only means that the socket was connected at some point in the past. It does not mean that it is still connected. If code has IsConnected, then there's a bug.
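One common shape for the "read loop and write queue processor" pattern is a channel per connection: producers enqueue packets, and a single consumer writes them out. A sketch using System.Threading.Channels (built into .NET Core 3.0+, available as a NuGet package for older frameworks); the ConnectionWriter type and its members are illustrative names:

```csharp
using System.IO;
using System.Threading.Channels;
using System.Threading.Tasks;

// The read loop and this write-queue processor run side by side. Any part
// of the server can enqueue a packet; exactly one consumer drains the
// queue and writes, so packets never interleave and there is no need to
// poll IsConnected - a failed write or a completed channel ends the loop.
sealed class ConnectionWriter
{
    private readonly Channel<byte[]> _outgoing =
        Channel.CreateUnbounded<byte[]>(new UnboundedChannelOptions { SingleReader = true });

    public bool TryEnqueue(byte[] packet) => _outgoing.Writer.TryWrite(packet);

    public void Complete() => _outgoing.Writer.Complete();

    // Single consumer: drains the queue until it is completed or a write throws.
    public async Task ProcessAsync(Stream stream)
    {
        await foreach (var packet in _outgoing.Reader.ReadAllAsync().ConfigureAwait(false))
        {
            await stream.WriteAsync(packet, 0, packet.Length).ConfigureAwait(false);
        }
    }
}
```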


Source: https://stackoverflow.com/questions/43301378/how-to-correctly-use-tpl-with-tcpclient
