I did some searching on this question, but people seem to emphasize only non-blocking IO.
Let's say I just have a very simple application to respond "Hello
The operating system gives each socket connection a send queue and a receive queue. That is where the bytes sit until something at the application layer handles them. If a connection's receive queue fills up, the client on that connection cannot send more data until space becomes available in the queue. This is why an application should handle requests as quickly as possible.
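To make the per-socket queues concrete, here is a minimal sketch that asks the OS how large the receive and send buffers of a freshly created TCP socket are. The helper name `buffer_sizes` is made up for illustration, and the numbers reported depend on the platform and its configuration.

```python
import socket
from typing import Tuple


def buffer_sizes(sock: socket.socket) -> Tuple[int, int]:
    """Return (receive_buffer_bytes, send_buffer_bytes) as reported by the OS."""
    rcv = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    snd = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    return rcv, snd


# Create an unconnected TCP socket just to inspect its default buffer sizes.
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    rcv, snd = buffer_sizes(s)
    print(f"receive buffer: {rcv} bytes, send buffer: {snd} bytes")
```

The same `getsockopt` calls work on a connected socket, so an application could log these values for a live connection if it needs to.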
If you are on a *nix system, you can use netstat to view the current number of bytes in the send and receive queues. In this example, there are 0 bytes in the receive queue and 240 bytes in the send queue (waiting to be sent out by the OS).
Proto  Recv-Q  Send-Q  Local Address  Foreign Address  State
tcp    0       240     x.x.x.x:22     x.x.x.x:*        LISTEN
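If you want to watch these numbers from a script, a netstat row can be split into its columns. This is only a rough sketch: `parse_netstat_line` is a hypothetical helper, and it assumes the whitespace-separated column layout shown above, which can differ between netstat versions and platforms.

```python
def parse_netstat_line(line):
    """Parse one netstat row into a dict, or return None for header/other lines."""
    fields = line.split()
    # Data rows have numeric Recv-Q and Send-Q in the second and third columns.
    if len(fields) < 5 or not (fields[1].isdigit() and fields[2].isdigit()):
        return None
    return {
        "proto": fields[0],
        "recv_q": int(fields[1]),   # bytes waiting for the application to read
        "send_q": int(fields[2]),   # bytes waiting to be sent out by the OS
        "local": fields[3],
        "foreign": fields[4],
        "state": fields[5] if len(fields) > 5 else "",
    }


sample = "tcp 0 240 x.x.x.x:22 x.x.x.x:* LISTEN"
row = parse_netstat_line(sample)
print(row["recv_q"], row["send_q"])
```

A Recv-Q value that stays above zero suggests the application is not reading from that socket as fast as data arrives.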
On Linux, you can check the default and maximum allowed sizes of the send/receive queues through the proc file system:
Receive:
cat /proc/sys/net/core/rmem_default
cat /proc/sys/net/core/rmem_max
Send:
cat /proc/sys/net/core/wmem_default
cat /proc/sys/net/core/wmem_max
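
The same four values can be read from a program. The sketch below assumes a Linux system where these proc files exist. `read_core_limits` is an illustrative name, and a value of None simply means the file could not be read (for example, on a non-Linux system).

```python
from pathlib import Path

CORE = Path("/proc/sys/net/core")
NAMES = ("rmem_default", "rmem_max", "wmem_default", "wmem_max")


def read_core_limits(base=CORE):
    """Return a dict mapping each setting name to its size in bytes, or None."""
    limits = {}
    for name in NAMES:
        try:
            limits[name] = int((base / name).read_text().strip())
        except (OSError, ValueError):
            limits[name] = None  # file missing, unreadable, or not a number
    return limits


for name, value in read_core_limits().items():
    print(f"{name}: {value}")
```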