Question
I've written a TCP client using Python and Twisted. It connects to a server and communicates using a simple string-based protocol (defined by the server manufacturer). The TCP/IP connection should persist, and the client should reconnect in case of failure.
When some sort of network error occurs (I assume on the server side or on some node along the way), it takes a very long time, much more than a few minutes, for the client to notice and initiate a new connection.
Is there a way to speed that up? Is there some sort of built-in TCP/IP keepalive functionality that can detect the disconnect sooner?
I can implement a keepalive mechanism myself and look for timeouts, but I'm not sure that's the best practice in this case. What do you think? Also, when using reactor.connectTCP() and reactor.run() with a ClientFactory, what's the best way to force a reconnection?
Answer 1:
Application-level keepalives for TCP-based protocols are a good idea, and you should probably implement one. Doing so gives you complete and precise control over the timeout semantics you want from your application.
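As a rough illustration of the application-level approach, here is a minimal, framework-agnostic sketch of the decision logic. The class name `KeepaliveMonitor`, its method names, and the 15 s / 45 s thresholds are all assumptions made up for this example, not part of Twisted or of the server's protocol.

```python
import time


class KeepaliveMonitor:
    """Tracks inbound traffic and decides when to ping or give up.

    Hypothetical helper: names and default timings are illustrative only.
    """

    def __init__(self, ping_after=15.0, dead_after=45.0, clock=time.monotonic):
        if dead_after <= ping_after:
            raise ValueError("dead_after must exceed ping_after")
        self.ping_after = ping_after
        self.dead_after = dead_after
        self._clock = clock          # injectable so the logic can be tested
        self._last_rx = clock()      # time of the most recent inbound data

    def data_received(self):
        """Call whenever any bytes arrive from the server."""
        self._last_rx = self._clock()

    def check(self):
        """Return 'ok', 'ping', or 'dead' depending on how long the link has been silent."""
        idle = self._clock() - self._last_rx
        if idle >= self.dead_after:
            return "dead"
        if idle >= self.ping_after:
            return "ping"
        return "ok"
```

In a Twisted client, one way to wire this up might be to call `data_received()` from the protocol's `dataReceived`, and to run `check()` periodically with `twisted.internet.task.LoopingCall`. On "ping" you would send whatever harmless request the server's protocol allows (assuming it has one). On "dead" you could call `self.transport.abortConnection()`, so that `connectionLost` fires and the factory (for example a `ReconnectingClientFactory`) can start a new connection.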
TCP itself has a keepalive mechanism. You can enable this with an ITCPTransport method call from your protocol. For example:
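    from twisted.internet.protocol import Protocol

    class YourProtocol(Protocol):
        def connectionMade(self):
            # Ask the OS to send TCP keepalive probes on this connection.
            self.transport.setTcpKeepAlive(True)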
The exact semantics of this keepalive are platform and configuration dependent. It's entirely possible this is already enabled and is what's detecting your connection loss. Thirty minutes is a pretty plausible amount of time for this mechanism to notice a lost connection.
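Because the timers are platform dependent, some platforms let you override them per socket. The following is a sketch using only the standard `socket` module. The function name `tune_keepalive` and the 60 s / 10 s / 5-probe values are assumptions for illustration, and the `TCP_KEEPIDLE`, `TCP_KEEPINTVL`, and `TCP_KEEPCNT` options are only set where Python exposes them (they are available on Linux, but not on every platform).

```python
import socket


def tune_keepalive(sock, idle=60, interval=10, count=5):
    """Enable TCP keepalive and, where the platform exposes them, shorten its timers.

    Returns a dict of the optional timer settings that were actually applied.
    """
    # Turning keepalive on is portable.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    applied = {}
    # The per-socket timer options are platform specific, so look them up defensively.
    for name, value in (
        ("TCP_KEEPIDLE", idle),       # idle seconds before the first probe
        ("TCP_KEEPINTVL", interval),  # seconds between probes
        ("TCP_KEEPCNT", count),       # failed probes before the connection is dropped
    ):
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
            applied[name] = value
    return applied
```

With Twisted's TCP transport, `self.transport.getHandle()` inside `connectionMade` should give you the underlying socket object to pass in, although that is worth verifying against the Twisted version and reactor you use.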
Answer 2:
As stated by Jean-Paul Calderone, you can either implement an application-level keepalive or use the TCP keepalive mechanism. The application-level keepalive is the preferred method, as it gives you more fine-grained control.
The TCP keepalive mechanism lives at the OS level, and its defaults are OS dependent but configurable. For example, the default Linux TCP keepalive works in the following way:
- After 2 hours of inactivity, send a keepalive probe.
- If this fails, send another probe every 75 seconds.
- After 9 consecutive failures, mark the connection as closed. This will be picked up by the server and it will trigger whatever cleanup mechanisms it has in place.
See: https://en.wikipedia.org/wiki/Keepalive#TCP_keepalive and http://tldp.org/HOWTO/TCP-Keepalive-HOWTO/usingkeepalive.html
So while the TCP keepalive will eventually reap your dead connections, it will take quite a long time to kick in.
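To put a number on "quite a long time", the Linux defaults listed above can be combined into an approximate detection time for an idle connection. This is a back-of-the-envelope sketch: the function name is made up for the example, and the formula (idle time plus one interval per failed probe) is a simplification of what the kernel does.

```python
def keepalive_detection_seconds(idle=7200, interval=75, probes=9):
    """Approximate time for TCP keepalive to declare an idle, dead peer lost.

    Defaults mirror the Linux values quoted above: 2 hours idle,
    75 seconds between probes, 9 failed probes.
    """
    return idle + interval * probes


total = keepalive_detection_seconds()
hours, rem = divmod(total, 3600)
minutes, seconds = divmod(rem, 60)
print(f"{total} s = {hours} h {minutes} min {seconds} s")
# prints: 7875 s = 2 h 11 min 15 s
```

With shorter, purely illustrative settings such as `keepalive_detection_seconds(idle=60, interval=10, probes=5)`, the same estimate drops to 110 seconds.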
Source: https://stackoverflow.com/questions/49459330/twisted-detection-of-lost-connection-takes-more-than-30-minutes