Question:
I'm running an Express.js application with socket.io for a chat webapp, and I get the following error at random, around 5 times every 24 hours. The node process is wrapped in forever, so it restarts itself immediately.
Problem is that restarting express kicks my users out of their rooms and nobody wants that.
The web server is proxied by HAProxy. There are no socket stability issues, just using websockets and flashsockets transports. I cannot reproduce this on purpose.
This is the error with node v0.10.11:
    events.js:72
            throw er; // Unhandled 'error' event
                  ^
    Error: read ECONNRESET     //alternatively it's a 'write'
        at errnoException (net.js:900:11)
        at TCP.onread (net.js:555:19)
    error: Forever detected script exited with code: 8
    error: Forever restarting script for 2 time
EDIT (2013-07-22)
Added both socket.io client error handler and the uncaught exception handler. Seems that this one catches the error:
    process.on('uncaughtException', function (err) {
      console.error(err.stack);
      console.log("Node NOT Exiting...");
    });
So I suspect it's not a socket.io issue but an http request to another server that I do or a mysql/redis connection. Problem is that the error stack doesn't help me identify my code issue. Here is the log output:
    Error: read ECONNRESET
        at errnoException (net.js:900:11)
        at TCP.onread (net.js:555:19)
How do I know what causes this? How do I get more out of the error?
OK, not very verbose, but here's the stack trace with "longjohn":
    Exception caught: Error ECONNRESET
    { [Error: read ECONNRESET]
      code: 'ECONNRESET',
      errno: 'ECONNRESET',
      syscall: 'read',
      __cached_trace__:
       [ { receiver: [Object], fun: [Function: errnoException], pos: 22930 },
         { receiver: [Object], fun: [Function: onread], pos: 14545 },
         {},
         { receiver: [Object], fun: [Function: fireErrorCallbacks], pos: 11672 },
         { receiver: [Object], fun: [Function], pos: 12329 },
         { receiver: [Object], fun: [Function: onread], pos: 14536 } ],
      __previous__:
       { [Error]
         id: 1061835,
         location: 'fireErrorCallbacks (net.js:439)',
         __location__: 'process.nextTick',
         __previous__: null,
         __trace_count__: 1,
         __cached_trace__: [ [Object], [Object], [Object] ] } }
Here I serve the flash socket policy file:
    net = require("net")
    net.createServer( (socket) =>
      socket.write("<?xml version=\"1.0\"?>\n")
      socket.write("<!DOCTYPE cross-domain-policy SYSTEM \"http://www.macromedia.com/xml/dtds/cross-domain-policy.dtd\">\n")
      socket.write("<cross-domain-policy>\n")
      socket.write("<allow-access-from domain=\"*\" to-ports=\"*\"/>\n")
      socket.write("</cross-domain-policy>\n")
      socket.end()
    ).listen(843)
Can this be the cause?
Answer 1:
You might have guessed it already: it's a connection error.
"ECONNRESET" means the other side of the TCP conversation abruptly closed its end of the connection. This is most probably due to one or more application protocol errors. You could look at the API server logs to see if it complains about something.
But since you are also looking for a way to inspect the error and potentially debug the problem, you should take a look at "How to debug a socket hang up error in NodeJS?", which was posted on Stack Overflow for a similar question.
Quick and dirty solution for development:
Use longjohn; it gives you long stack traces that include the async operations.
Clean and correct solution: Technically, in node, whenever you emit an 'error' event and no one listens to it, it will throw. To make it not throw, put a listener on it and handle it yourself. That way you can log the error with more information.
To have one listener for a group of calls, you can use domains, which also catch other errors at runtime. Make sure each async operation related to http (Server/Client) runs in a different domain context from the other parts of the code. The domain will automatically listen for the error events and propagate them to its own handler, so you only listen to that handler and get the error data. You also get more information for free.
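The "put a listener on it" advice can be sketched as follows. This is a minimal illustration using a bare EventEmitter rather than a real socket; the helper names (watchErrors, emitThrows, makeReset) and the 'redis' label are hypothetical and not part of any library. The idea is to give every outgoing connection its own labelled listener, so the log says which connection was reset.

```javascript
const { EventEmitter } = require('events');

// Hypothetical helper: attach a labelled 'error' listener so the log names
// the connection that failed instead of only printing "read ECONNRESET".
function watchErrors(emitter, label, log = console.error) {
  emitter.on('error', (err) => {
    log(`[${label}] ${err.syscall || 'unknown'} ${err.code || err.message}`);
  });
  return emitter;
}

// Returns true if emitting 'error' throws, which happens when no listener is attached.
function emitThrows(emitter, err) {
  try {
    emitter.emit('error', err);
    return false;
  } catch (thrown) {
    return true;
  }
}

// Build an error shaped like the one in the question's log.
function makeReset() {
  return Object.assign(new Error('read ECONNRESET'), {
    code: 'ECONNRESET',
    syscall: 'read',
  });
}

// No listener: the emit throws and would crash the process if uncaught.
console.log(emitThrows(new EventEmitter(), makeReset()));
// Labelled listener: the error is logged and the process keeps running.
console.log(emitThrows(watchErrors(new EventEmitter(), 'redis'), makeReset()));
```

In a real application the same helper would be called on each socket or client object that emits 'error' events, assuming the client library in use exposes them that way.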
EDIT (2013-07-22)
As I wrote above:
"ECONNRESET" means the other side of the TCP conversation abruptly closed its end of the connection. This is most probably due to one or more application protocol errors. You could look at the API server logs to see if it complains about something.
What could also be the case: at random times, the other side is overloaded and simply kills the connection as a result. Whether that's the case depends on what exactly you're connecting to…
But one thing's for sure: you indeed have a read error on your TCP connection which causes the exception. You can see that by looking at the error code you posted in your edit, which confirms it.
Answer 2:
A simple tcp server I had for serving the flash policy file was causing this. I can now catch the error using a handler:
    # serving the flash policy file
    net = require("net")
    net.createServer((socket) =>
      # just added
      socket.on("error", (err) =>
        console.log("Caught flash policy server socket error: ")
        console.log(err.stack)
      )
      socket.write("<?xml version=\"1.0\"?>\n")
      socket.write("<!DOCTYPE cross-domain-policy SYSTEM \"http://www.macromedia.com/xml/dtds/cross-domain-policy.dtd\">\n")
      socket.write("<cross-domain-policy>\n")
      socket.write("<allow-access-from domain=\"*\" to-ports=\"*\"/>\n")
      socket.write("</cross-domain-policy>\n")
      socket.end()
    ).listen(843)
Answer 3:
I had a similar problem where apps started erroring out after an upgrade of Node. I believe this can be traced back to this item in the Node v0.9.10 release:
- net: don't suppress ECONNRESET (Ben Noordhuis)
Previous versions wouldn't error out on interruptions from the client. Now a break in the connection from the client raises the error ECONNRESET in Node. I believe this is intended functionality for Node, so the fix (at least for me) was to handle the error, which I believe you did in your uncaughtException handler, although I handle it in the net.Socket error handler.
You can demonstrate this:
Make a simple socket server and get Node v0.9.9 and v0.9.10.
    require('net')
      .createServer(function (socket) {
        // do nothing
      })
      .listen(21, function () {
        console.log('Socket ON')
      })
Start it up using v0.9.9 and then attempt to FTP to this server. I'm using FTP and port 21 only because I'm on Windows and have an FTP client, but no telnet client handy.
Then from the client side, just break the connection. (I'm just doing Ctrl-C)
You should see NO ERROR when using Node v0.9.9, and an ERROR when using Node v0.9.10 and up.
In production, I use v0.10.x and it still gives the error. Again, I think this is intended and the solution is to handle the error in your code.
Answer 4:
I was facing the same issue, but I mitigated it by placing:
    server.timeout = 0;
before server.listen. Here, server is an HTTP server. The default timeout is 2 minutes as per the API documentation.
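A minimal sketch of where that line goes, assuming a plain Node http server. The factory name createChatServer and the placeholder request handler are hypothetical; an Express app would take the handler's place.

```javascript
const http = require('http');

// Hypothetical factory; the request handler is only a placeholder.
function createChatServer() {
  const server = http.createServer((req, res) => res.end('ok'));
  // A value of 0 disables the idle-socket timeout on incoming connections.
  server.timeout = 0;
  return server;
}
```

The caller would then start it with something like createChatServer().listen(3000), where the port is an arbitrary choice for illustration.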
Answer 5:
Had the same problem today. After some research I found a very useful Node.js option: --abort-on-uncaught-exception. Not only does it provide a much more verbose and useful error stack trace, it also saves a core file when the application crashes, allowing further debugging.
Answer 6:
Yes, your serving of the policy file can definitely cause the crash.
To reproduce it, just add a delay to your code:
    net.createServer( function(socket) {
        for(i=0; i<1000000000; i++);
        socket.write("<?xml version=\"1.0\"?>\n")
    …
…and use telnet to connect to the port. If you disconnect telnet before the delay has expired, you'll get a crash (uncaught exception) when socket.write throws an error.
To avoid the crash here, just add an error handler before reading/writing the socket:
    net.createServer( function(socket) {
        for(i=0; i<1000000000; i++);
        socket.on('error', function() { console.log("error"); });
        socket.write("<?xml version=\"1.0\"?>\n")
When you try the above disconnect, you'll just get a log message instead of a crash.
And when you're done, remember to remove the delay.
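For reference, here is a plain JavaScript sketch of the policy server with the error handler registered before any write and without the artificial delay. The factory name createPolicyServer, the POLICY constant, and the injectable log parameter are hypothetical choices made for this illustration; the policy text is the one from the question.

```javascript
const net = require('net');

// Policy document from the question, joined into one string.
const POLICY =
  '<?xml version="1.0"?>\n' +
  '<!DOCTYPE cross-domain-policy SYSTEM "http://www.macromedia.com/xml/dtds/cross-domain-policy.dtd">\n' +
  '<cross-domain-policy>\n' +
  '<allow-access-from domain="*" to-ports="*"/>\n' +
  '</cross-domain-policy>\n';

// Hypothetical factory: builds the server but leaves listen() to the caller.
function createPolicyServer(log = console.error) {
  return net.createServer((socket) => {
    // Register the handler first, so a reset during the write is logged
    // instead of surfacing as an unhandled 'error' event.
    socket.on('error', (err) => {
      log('Caught flash policy server socket error: ' + err.message);
    });
    // Send the whole policy, then close our side of the connection.
    socket.end(POLICY);
  });
}
```

It would be started with createPolicyServer().listen(843), as in the question. Binding to port 843 usually needs elevated privileges on Unix-like systems.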
Answer 7:
Another possible (but rare) case is server-to-server communication where server.maxConnections has been set to a very low value.
In node's core lib net.js, the server then calls clientHandle.close(), which also causes an ECONNRESET error on the other end:
    if (self.maxConnections && self._connections >= self.maxConnections) {
      clientHandle.close(); // causes ECONNRESET on the other end
      return;
    }
Answer 8:
I solved the problem by simply connecting to a different network. That is one of the possible causes.
As discussed above, ECONNRESET means that the other side of the TCP conversation abruptly closed its end of the connection.
Your internet connection might be blocking you from connecting to some servers. In my case, I was trying to connect to mLab (a cloud database service that hosts MongoDB databases), and my ISP was blocking it.
Answer 9:
Try adding these options to socket.io:
    const options = {
      transports: ['websocket'],
      pingTimeout: 3000,
      pingInterval: 5000
    };
I hope this helps!