Node.js/Express and parallel queues

Submitted by 喜欢而已 on 2019-12-04 09:08:08

Question


We are building an infrastructure which features a Node.js server and Express.

On the server, the following happens:

  1. The server accepts an incoming HTTP request from the client.
  2. The server generates two files (this operation can be "relatively long", which here means something like 0.1 seconds).
  3. The server uploads the generated files (~20-200 KB each) to an external CDN.
  4. The server responds to the client, and the response includes the URI of the file on the CDN.

Currently the server does this sequentially for each request, and this works quite well (Node/Express can handle concurrent requests automatically). However, as we plan to grow, the number of concurrent requests may increase, and we believe it would be better to implement a queue for processing requests. Otherwise, we risk having too many tasks running at the same time and too many open connections to the CDN. Responding to the client quickly is not a priority.

What I was thinking about is to have a separate part in the Node server that contains a few "workers" (2-3, but we will do tests to determine the correct number of simultaneous operations). So, the new flow would look something like:

  1. After accepting the request from the client, the server adds an operation to a queue.
  2. There are 2-3 (to be tested) workers that take elements out of the queue and perform all the operations (generate the files and upload them to the CDN).
  3. When the worker has processed the operation (it doesn't matter if it stays in the queue for a relatively long time), it notifies the Node server (a callback), and the server responds to the client (which has been waiting in the meantime).

What do you think of this approach? Do you believe it is the correct one?

Most importantly, HOW could this be implemented in Node/Express?

Thank you for your time.


Answer 1:


tl;dr: You can use the native Node.js cluster module to handle a lot of concurrent requests.

Some preamble: Node.js per se is single-threaded. Its event loop is what makes it excellent at handling multiple requests simultaneously even with its single-threaded model, which is one of its best features IMO.

The real deal: So, how can we scale this to handle even more concurrent connections and use all available CPUs? With the cluster module.

This module works exactly as described by @Qualcuno: it allows you to create multiple workers (i.e. processes) behind the master to share the load and use the available CPUs more efficiently.

According to the official Node.js documentation:

Because workers are all separate processes, they can be killed or re-spawned depending on your program's needs, without affecting other workers. As long as there are some workers still alive, the server will continue to accept connections.

The required example:

var cluster = require('cluster');
var http = require('http');
var numCPUs = require('os').cpus().length;

if (cluster.isMaster) {
  // Fork workers.
  for (var i = 0; i < numCPUs; i++) {
    cluster.fork();
  }

  cluster.on('exit', function(worker, code, signal) {
    console.log('worker ' + worker.process.pid + ' died');
  });
} else {
  // Workers can share any TCP connection.
  // In this case it's an HTTP server.
  http.createServer(function(req, res) {
    res.writeHead(200);
    res.end("hello world\n");
  }).listen(8000);
}

Hope this is what you need.

Comment if you have any further questions.




Answer 2:


(Answering my own question)

According to this question on Stack Overflow, a solution in my case would be to implement a queue using Caolan McMahon's async module.

The main application creates jobs and pushes them into a queue, which has a limit on the number of jobs that can run concurrently. This allows tasks to be processed concurrently, but with strict control over the limit. It works like Cocoa's NSOperationQueue on Mac OS X.
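
To make the idea concrete, here is a minimal hand-rolled sketch of such a queue in plain JavaScript. It is not the async module's code; it only imitates the general shape of a queue built from a worker function plus a concurrency limit. The names createQueue and generateAndUpload, the example.com URI, and the commented-out route are all hypothetical and used purely for illustration.

```javascript
// Minimal concurrency-limited job queue (illustrative sketch only).
// worker(task, done) processes one task and calls done(err, result) when finished.
function createQueue(worker, concurrency) {
  var pending = []; // jobs waiting for a free slot
  var running = 0;  // jobs currently being processed

  function next() {
    // Start jobs until the limit is reached or nothing is waiting.
    while (running < concurrency && pending.length > 0) {
      var job = pending.shift();
      running++;
      runJob(job);
    }
  }

  function runJob(job) {
    worker(job.task, function (err, result) {
      running--;                 // free the slot
      job.callback(err, result); // notify whoever pushed the job
      next();                    // pick up the next waiting job, if any
    });
  }

  return {
    push: function (task, callback) {
      pending.push({ task: task, callback: callback });
      next();
    },
    running: function () { return running; },
    length: function () { return pending.length; }
  };
}

// Hypothetical stand-in for "generate two files and upload them to the CDN".
function generateAndUpload(task, done) {
  setTimeout(function () {
    done(null, 'https://cdn.example.com/' + task.id);
  }, 10);
}

// At most 2 jobs run at once; the right number should be found by testing.
var queue = createQueue(generateAndUpload, 2);

// In an Express route (assumed setup, not part of this sketch) the response
// would simply be sent from the job's callback:
//
//   app.post('/files', function (req, res) {
//     queue.push({ id: req.body.id }, function (err, uri) {
//       if (err) { return res.status(500).end(); }
//       res.json({ uri: uri });
//     });
//   });
```

Because the HTTP response is only sent from the job's callback, the client keeps waiting while its job sits in the queue, which matches the flow described in the question. A real deployment would probably also want timeouts and a cap on the queue length, which this sketch leaves out.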




Answer 3:


To do this, I would use a structure like the one Heroku provides with Web/Worker Dynos (servers). The web servers accept the requests and pass the information on to the workers, which do the processing and uploading. I would have the front-end site listen on a socket (socket.io) for the URL of the file on the external CDN, which the worker emits when the upload is finished. Hopefully that makes sense.




Answer 4:


You can use the Kue module, with Redis as the database that holds the jobs and backs the queue. You create jobs and place them in a queue using the Kue module, and you can assign as many workers as you like to process them. Useful link: Kue - https://github.com/Automattic/kue



Source: https://stackoverflow.com/questions/22107144/node-js-express-and-parallel-queues
