Netty writeAndFlush future is successful when writing to a killed host

拜拜、爱过 submitted on 2020-01-06 04:04:27

Question


We have a Netty (4.0.15) based WebSocket server running on Ubuntu v10, and during resiliency testing we do the following:

  1. kill -9 server
  2. send some data from client
  3. expect writeAndFlush failure on client

For some reason, we sometimes see:

  1. writeAndFlush succeeds, and then afterwards
  2. java.io.IOException: Connection reset by peer

So is it possible that writeAndFlush sometimes completes successfully even though the server is gone, while at other times it fails?

Maybe this depends on when the OS gets around to cleaning up the sockets of the killed process?

Client test code:

    channel.writeAndFlush(new TextWebSocketFrame("blah blah")).addListeners(
    <snip>
            public void operationComplete(ChannelFuture future) {
                assert !future.isSuccess();  // sometimes this is not triggered
            }
    </snip>

Thanks for any ideas,


Answer 1:


It's a simple race condition, and something that you have to accept can happen. You can only determine that the remote host has disappeared by not receiving data from it. Generally this is done by setting a timer and assuming that the remote host is dead if no data has been received before the timer expires (possibly in response to a keep-alive message).
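The timer approach can be sketched in plain Java, independent of Netty. This is only a minimal illustration under stated assumptions: the class name `LivenessTracker`, its methods, and the injected clock are all hypothetical and come from neither the question nor Netty.

```java
import java.util.function.LongSupplier;

/** Hypothetical helper: presumes the peer dead once nothing has been received for timeoutMillis. */
final class LivenessTracker {
    private final long timeoutMillis;
    private final LongSupplier clock;        // injected clock (assumption) so the logic is testable without sleeping
    private volatile long lastReceivedMillis;

    LivenessTracker(long timeoutMillis, LongSupplier clock) {
        this.timeoutMillis = timeoutMillis;
        this.clock = clock;
        this.lastReceivedMillis = clock.getAsLong();
    }

    /** Call whenever any data arrives from the peer, including a keep-alive reply. */
    void onDataReceived() {
        lastReceivedMillis = clock.getAsLong();
    }

    /** True once the peer has been silent for strictly longer than the timeout. */
    boolean isPeerPresumedDead() {
        return clock.getAsLong() - lastReceivedMillis > timeoutMillis;
    }
}
```

In a real client the clock would typically be `System::currentTimeMillis`, and `isPeerPresumedDead()` would be polled from a scheduled task. In a Netty pipeline, the bundled `IdleStateHandler` could play a similar role for read-idle detection.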

Essentially, TCP assumes that the remote host is dead if it retransmits some data a certain number of times without receiving an acknowledgement, or if it receives no response to a keep-alive probe (keep-alive is usually off by default). However, as long as there is room in your host's send buffer, you can continue to call writeAndFlush successfully, because the data is simply queued in the network buffers. writeAndFlush is considered to have succeeded once Netty has written the data to the kernel send buffer. There is no way of determining whether the data reached the remote host without an application-level acknowledgement.

Thus you may call writeAndFlush while TCP is still in the process of determining that the remote host has died, in which case writeAndFlush succeeds but the data is never delivered. Alternatively, you may call writeAndFlush just as TCP determines that the remote host is dead, in which case an error is raised.
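One way to picture an application-level acknowledgement is a small table of pending messages, each backed by a future that completes only when the peer's acknowledgement arrives and fails on timeout otherwise. The sketch below is an assumption-laden illustration, not part of Netty or the original answer: `AckTracker`, `awaitAck`, `onAck`, and the idea of numeric message IDs are all hypothetical, and `orTimeout` requires Java 9 or later.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: tracks messages that were written but not yet acknowledged by the peer. */
final class AckTracker {
    private final Map<Long, CompletableFuture<Void>> pending = new ConcurrentHashMap<>();

    /**
     * Register a message just before writing it. The returned future completes
     * normally only when onAck is called for the same ID, and completes
     * exceptionally (cause: TimeoutException) if no ack arrives in time.
     */
    CompletableFuture<Void> awaitAck(long messageId, long timeout, TimeUnit unit) {
        CompletableFuture<Void> ack = new CompletableFuture<>();
        pending.put(messageId, ack);
        return ack.orTimeout(timeout, unit)
                  // Drop the table entry however the wait ends (ack or timeout).
                  .whenComplete((ignored, error) -> pending.remove(messageId));
    }

    /** Call when an acknowledgement for messageId is read from the connection. */
    boolean onAck(long messageId) {
        CompletableFuture<Void> ack = pending.remove(messageId);
        return ack != null && ack.complete(null);
    }
}
```

With something like this in place, the success of the write future would only mean "handed to the kernel", while the future returned by `awaitAck` would be the signal that the remote host actually received the message.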

There's a lot more information on TCP retransmission and keep alive here and here



Source: https://stackoverflow.com/questions/21677864/netty-writeandflush-with-future-is-successfull-to-killed-host
