The Problem
An app I'm maintaining keeps getting socket timeouts after approximately 21000 ms, despite the fact that I've explicitly set longer timeouts. This seemingly magical value of 21000 ms has come up in a few other SO questions and answers, and I'm trying to figure out exactly where it comes from.
Here's the essence of my code:
HttpURLConnection connection = null;
try {
URL url = new URL(urlString);
connection = (HttpURLConnection) url.openConnection();
connection.setConnectTimeout(45000);
connection.setReadTimeout(90000);
int responseCode = connection.getResponseCode();
if (responseCode == 200) {
// code omitted
}
} catch (Exception e) {
// code omitted
}
finally {
if (connection != null) {
connection.disconnect();
}
}
Catching all exceptions in one block is admittedly not ideal, but it's inherited code and I'm reluctant to mess with it. I know it's catching SocketTimeoutException
after 21000 ms because it logs the simple name of the exception class.
Clues
I found a question where an asker was getting a ConnectTimeout
after 21000 ms, despite explicitly setting it to 40000 ms. That's intriguing despite the exception class being different.
I also found a poorly-explained answer which claims that the server side is responsible for the 21000 ms timeout.
My Hunch
I don't think any action or inaction of the server could cause a shorter-than-expected socket timeout on the client. But maybe the TCP stacks in Windows and Android share a common ancestor, or at least use similar connect retry logic.
Could it be that Android imposes a maximum connect timeout of 21000 ms, and setting a longer timeout in HttpURLConnection
is futile? Or could this timeout be triggered by some Windows machine on the path between the mobile device and the server? Do some Android versions throw a SocketTimeoutException
where others throw a ConnectException
?
According to RFC 1122 (TRANSPORT LAYER -- TCP), section 4.2.3.1 ("Retransmission Timeout Calculation"):
"Implementation also MUST include exponential backoff for successive RTO values for the same segment".
So xpa1492's answer sounds plausible (despite its Windows-specific nature); the implementation of a TCP stack either follows this RFC or gets panned for failing to do so.
By the way, RFC 1122 specifies 3 seconds as the initial timeout, explicitly, making xpa1492's (3 + 6 + 12 = 21) answer sound like the answer to your mystery.
And yes, the Android TCP stack shares a common ancestor with Windows TCP stack; they were both created using RFC 1122 as a guide ("[The Linux TCP stack is] an implementation of the TCP protocol defined in RFC 793, RFC 1122 and RFC 2001 with the NewReno and SACK extensions").
I suspect that your problem is related to radio interference, so you might want to try enabling F-RTO, as you might be hitting the "magic number" repeatedly because of the environment in which you are testing.
It seems like it is a Windows default configuration...
Based on the link and some further reading, Windows will by default do 3 retries and double the timeout with each attempt, starting a s 3sec one. So you end up with 3sec + 6sec + 12sec = 21sec timeout.
I wrote a crude test app, based on the code in my question, that simulates a connect timeout by attempting to connect to a non-routable address as suggested in this answer. On my Moto G (Android 4.4.2), it throws a SocketTimeoutException
in approximately 45 seconds as expected. Curiously, if I do not explicitly set the connect timeout, it instead throws a ConnectException
after approximately one minute.
I'm going to write a slightly more sophisticated test app and send it to the customer to try to determine if the device itself is imposing a 21s timeout, or if some router on their mobile network might be the culprit. I'll update this answer with the results.
Result: This appears to be an OS bug that affects the Samsung SPH-P100 (Galaxy Tab 1) from Sprint. I don't have access to a Tab 1 from any other carrier, so this could be blamed on Samsung or Sprint. It does not seem to generally affect Android 2.x, because I have a ZTE X501 running 2.3.6 which allows me to set longer timeouts.
来源:https://stackoverflow.com/questions/26896414/where-does-the-socket-timeout-of-21000-ms-come-from