NodeJS Event Loop Fundamentals

Question

I'm sure this is a commonly asked question, but I couldn't find a concrete answer.

I kind of understand the basic concept of NodeJS and its asynchronous/non-blocking nature of processing I/O.

For argument's sake, let's take a simple example of an HTTP server written in node that executes the unix command 'find /' and writes the result to the HTTP response (therefore displaying the result of the command in the user's browser). Let's assume that this takes 3 seconds.

Let's assume that there are two users, 'A' and 'B', sending requests from their browsers at exactly the same time.

As I understand it, the users' requests are queued in the event queue (Message A, Message B). Each message also has a reference to its associated callback, to be executed once the processing is done.

Since the event loop is single-threaded and processes events one by one, will it take 6 seconds in my example above for the callback of "User B" to get triggered? [3 for "User A"'s event processing and 3 for its own event processing]

This sounds like I'm missing something here?

The worst case is if 100 users send requests in the same millisecond. The 100th event owner is going to be the most unfortunate user and will have to wait for an eternity.

As I understand it, there is only one event queue in the runtime, so the above problem can apply to any user in any part of the application. For example, would a slow database query in web page X slow down a different user in web page Y?

Fundamentally, I see a problem in serial processing of events and serial execution of their associated callbacks.

Am I missing something here?

Answer 1:

A properly written node.js server will use async I/O for any networking, disk I/O, timers or communication with other processes. When written this way, multiple http requests can be worked on in parallel. Though only one piece of node.js request-handling code runs at any given moment, any time one request is waiting for I/O (which is typically much of the lifetime of a request), other requests can run.

The end result is that all requests appear to progress at the same time (though in reality, the work on them is interleaved). The Javascript event queue is the mechanism for serializing the work among all the various requests. Whenever an async operation finishes its work or wishes to notify the main JS thread of some event, it puts something in the event queue. When the current thread of JS execution finishes (even if it has its own async operations in progress), the JS engine looks in the event queue and executes the next item in that queue (usually some form of a callback) and, in that way, the next queued operation proceeds.
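To make the "current thread of execution finishes first" rule concrete, here is a minimal sketch (the `order` array and the 50 ms busy-wait are just illustrative assumptions). A timer callback that is already due still has to wait until the synchronous code that is currently running has completed.

```javascript
// Records the order in which pieces of code actually run.
const order = [];

// Queue a callback; it cannot run until the current synchronous code has finished.
setTimeout(() => order.push('timer callback'), 0);

order.push('sync work start');
// Busy-wait for ~50 ms: the timer above is overdue by now, but it must wait its turn.
const start = Date.now();
while (Date.now() - start < 50) { /* simulate synchronous work */ }
order.push('sync work end');

// A later timer lets us inspect the result after the first callback has run.
setTimeout(() => console.log(order.join(' -> ')), 10);
// prints: sync work start -> sync work end -> timer callback
```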

In your specific example, when you fire up another process and then asynchronously wait for its result, the current thread of execution finishes and then the next item in the event queue gets to run. If that next item is another http request, then that request starts processing. When this second request then hits some async point, its thread of execution finishes and again the next item in the event queue runs. In this way, new http requests get started and callbacks from async operations that have finished get to run. Things happen in roughly FIFO (first-in, first-out) order according to when they were put in the event queue. I say "roughly" because there are actually different types of events and not all are serialized equally, but for the purpose of this discussion that implementation detail can be ignored.
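As a rough sketch of the scenario in the question (not code from the original post), the snippet below starts two slow child processes back to back. The helper name `runSlowCommand` is hypothetical, and a child Node process that idles for about 200 ms is an assumption standing in for `find /`. The point is only that the second child is started before the first one has finished, because starting a child does not block the node.js thread.

```javascript
const { execFile } = require('node:child_process');

// Records when each child is started and when its callback runs.
const events = [];

// Hypothetical helper: runs a slow command in a child process and resolves with its output.
function runSlowCommand(label) {
  events.push(`${label}: child started`);
  return new Promise((resolve, reject) => {
    // Assumption: a child Node process that idles ~200 ms stands in for `find /`.
    execFile(
      process.execPath,
      ['-e', 'setTimeout(() => console.log("done"), 200)'],
      (err, stdout) => {
        if (err) return reject(err);
        events.push(`${label}: child finished`);
        resolve(stdout.trim());
      }
    );
  });
}

// Two "requests" arriving at the same time: both children run concurrently.
const bothDone = Promise.all([runSlowCommand('A'), runSlowCommand('B')]);

bothDone.then((outputs) => {
  console.log(events);  // both "started" entries come before any "finished" entry
  console.log(outputs);
});
```

In a real server the same idea would apply inside the request handler: start the child, return to the event loop, and write the response from the callback.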

So, if three http requests arrive at the exact same time, then one will run until it hits an async point. Then, the next will run until it hits an async point. Then, the third will run until it hits an async point. Then, whichever request finishes its first async operation will get a callback from that async operation and it will run until it is done or hits another async point. And, so on...
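The three-request sequence above can be simulated without a real server. This is a sketch under stated assumptions: `fakeIo` and `handleRequest` are hypothetical names, and a timer stands in for the disk or network wait. Each request runs until its async point, and the requests then finish in the order their waits complete, not the order they arrived.

```javascript
// Records what happens, in the order it happens.
const log = [];

// Hypothetical stand-in for async I/O (disk, network, child process): resolves after `ms`.
function fakeIo(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Hypothetical request handler: runs synchronously up to the await, then yields.
async function handleRequest(name, ioMs) {
  log.push(`${name}: started`);
  await fakeIo(ioMs); // async point: control returns to the event loop here
  log.push(`${name}: finished`);
}

// Three "requests" arriving at the same time, each with a different I/O wait.
const allDone = Promise.all([
  handleRequest('A', 60),
  handleRequest('B', 20),
  handleRequest('C', 40),
]).then(() => log);

allDone.then((entries) => console.log(entries.join('\n')));
```

All three requests log "started" before any of them logs "finished", and the total time is roughly that of the longest wait rather than the sum of all three.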

Since what usually makes a web server slow to respond is some sort of I/O operation (disk or networking), all of which can be programmed asynchronously in node.js, this whole process generally works quite well, and it's actually a lot more efficient with server resources than using a separate thread per request. The one time that it doesn't work very well is when there's a compute-intensive or otherwise long-running operation that is not asynchronous and ties up the main node.js thread for long periods of time. Because node.js is a cooperative CPU-sharing system, such an operation will hog the system (there is no pre-emptive sharing with other operations like there could be in a multi-threaded system). Hogging the system makes all other requests wait until the first one is done. The node.js answer to a CPU-hogging computation is to move that one operation to another process and communicate asynchronously with that other process from the node.js thread, thus preserving the async model for the single node.js thread.
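Here is one possible sketch of moving a CPU-heavy computation into another process. The names `heavyScript` and `computeInChild` are hypothetical, and summing integers is only an assumed stand-in for real compute-intensive work. While the child computes, a timer in the main process keeps firing, which shows that the node.js thread is not blocked.

```javascript
const { execFile } = require('node:child_process');

// CPU-heavy work written as source text so it can run in a separate Node process.
// Assumption: summing integers stands in for a real compute-intensive task.
const heavyScript = `
  let sum = 0;
  for (let i = 0; i < 1e7; i++) sum += i;
  console.log(sum);
`;

// Hypothetical helper: runs the heavy work in a child process and resolves with its result.
function computeInChild() {
  return new Promise((resolve, reject) => {
    execFile(process.execPath, ['-e', heavyScript], (err, stdout) => {
      if (err) return reject(err);
      resolve(Number(stdout));
    });
  });
}

// The main thread stays responsive: this timer keeps ticking while the child works.
let ticks = 0;
const ticker = setInterval(() => { ticks += 1; }, 5);

const result = computeInChild().then((sum) => {
  clearInterval(ticker);
  return { sum, ticks };
});

result.then(({ sum, ticks }) => {
  console.log(`sum = ${sum}, timer ticks while waiting = ${ticks}`);
});
```

Had the same loop run directly in the main process, no timer callback (or incoming request) could have been serviced until the loop finished.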

For node.js database operations, the database will generally provide an async interface for node.js code to use, and it is then up to the implementation of that interface to actually do the work asynchronously. This will likely be done by communicating with some other process where the actual database logic is implemented (probably via TCP). That database logic may use actual threads or not; that's an implementation detail that is up to the database itself. What is important to node.js is that the computation and database work happen outside the node.js thread in some other process, perhaps even on another host, so they do not block the node.js thread.

Source: https://stackoverflow.com/questions/30553755/nodejs-event-loop-fundamendals

Tags

node.js

javascript-events

event-loop