Scaling a decoupled realtime server alongside a standard webserver

If the scenario is

a) The main web server raises a message upon an action (say, a record is inserted)

b) It notifies the appropriate real-time server

then you could decouple these two steps by using an intermediate pub/sub layer that forwards each message to its intended recipient.

An implementation would be

1) You have a Redis pub/sub channel; when a client connects to a real-time socket, you start listening on that channel

2) When the main app wants to notify a user via the real-time server, it pushes a message to the channel; the real-time server gets it and forwards it to the intended user.

This way, you decouple the realtime notification from the main app and you don't have to keep track of where the user is.
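A minimal sketch of this flow with redis-py is below. The channel naming scheme and the `websocket` handle are assumptions, not any particular framework's API, and a real server would run the listening loop on a background thread per connection:

```python
# Minimal sketch with redis-py. Channel names and the websocket handle
# are illustrative assumptions, not a specific framework's API.
import redis

r = redis.Redis(host="localhost", port=6379)

# Real-time server side: on connect, listen on that user's channel.
def on_client_connect(user_id, websocket):
    pubsub = r.pubsub()
    pubsub.subscribe(f"notifications:{user_id}")
    for message in pubsub.listen():          # blocking loop; run it off the main thread
        if message["type"] == "message":
            websocket.send(message["data"])  # forward to the connected client

# Main app side: publish without knowing which real-time server holds the user.
def notify_user(user_id, payload):
    r.publish(f"notifications:{user_id}", payload)
```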

The problem you are describing is the common "message backplane" used, for example, in SignalR; it is also related to the "fanout" message exchange in messaging architectures. With a backplane or fanout, every message is forwarded to every messaging node, so clients can connect to any server and still get the message. This approach is reasonable when you have to support both long polling and websockets. However, as you noticed, it is a waste of traffic and resources.

You need to use a message infrastructure with intelligent routing, like RabbitMQ. Take a look at topic and headers exchanges: https://www.rabbitmq.com/tutorials/amqp-concepts.html

How Topic Exchanges Route Messages

RabbitMQ for Windows: Exchange Types

There are tons of different queuing frameworks. Pick the one you like, but make sure it offers more exchange modes than just direct or fanout ;) In the end, a WebSocket is just an endpoint for connecting to a message infrastructure. So if you want to scale out, it boils down to the backend you have :)
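As a sketch of what topic routing buys you, here is roughly how it could look with pika; the exchange name "events" and the "user.<id>.<event>" routing-key scheme are invented for illustration. Each real-time server binds its queue only for the users it actually hosts, so messages are routed instead of fanned out to every node:

```python
# Sketch with pika; exchange name and routing keys are assumptions.
import pika

conn = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
ch = conn.channel()
ch.exchange_declare(exchange="events", exchange_type="topic")

# Real-time server: a private queue bound only for the users connected here.
queue = ch.queue_declare(queue="", exclusive=True).method.queue
ch.queue_bind(exchange="events", queue=queue, routing_key="user.42.*")

def handle(channel, method, properties, body):
    # Look up the local websocket for this routing key and forward the body.
    print(f"got {method.routing_key}: {body!r}")

ch.basic_consume(queue=queue, on_message_callback=handle, auto_ack=True)
# ch.start_consuming()  # blocks; this runs in the real-time server process

# Main app: publish with a routing key; only matching bindings receive it.
ch.basic_publish(exchange="events",
                 routing_key="user.42.record_inserted",
                 body=b"record inserted")
```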

For just a few realtime servers, you could conceivably keep a list of them in the main server and go through them round-robin.
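A tiny sketch of that, assuming a hypothetical `send_to` helper and placeholder host names:

```python
# Round-robin over a static list of real-time servers (hosts are placeholders).
from itertools import cycle

realtime_servers = cycle(["rt1:9000", "rt2:9000", "rt3:9000"])

def notify(payload):
    server = next(realtime_servers)  # rotates through the list forever
    send_to(server, payload)         # hypothetical transport (HTTP, queue, ...)
```

Note this only balances load; it doesn't know which server holds a given user's connection, so it fits broadcasts better than targeted messages.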

Another approach is to use a load balancer.

Basically, you'd have one dedicated node that receives the requests from the main server, and that load-balancer node takes care of choosing which websocket/realtime server to forward each request to.

Of course, this just shifts the code complexity from the main server to a new component, but conceptually I think it's better and more decoupled.
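One way that dedicated node could pick a target is a deterministic hash of the user id, so the same user always maps to the same realtime server without a lookup table. A sketch under those assumptions is below; note that plain modulo hashing remaps many users whenever the server list changes, which is what consistent hashing avoids:

```python
# Dispatcher sketch: deterministic, sticky server choice per user.
# The host list is a placeholder.
import hashlib

REALTIME_SERVERS = ["rt1:9000", "rt2:9000", "rt3:9000"]

def server_for(user_id: str) -> str:
    digest = hashlib.sha1(user_id.encode()).digest()
    index = int.from_bytes(digest[:4], "big") % len(REALTIME_SERVERS)
    return REALTIME_SERVERS[index]
```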

Changed the answer because a reply indicated that the "main" and "realtime" servers are already load-balanced clusters and not individual hosts.

The central scalability question seems to be:

"My general workflow is when something occurs on the main server that triggers the need for a realtime message, the main server sends that message to the realtime server (via a message queue) and the realtime server distributes it to any related connection."

Emphasis on the word "related". Assume you have 10 "main" servers and 50 "realtime" servers, and an event occurs on main server #5: which of the websockets would be considered related to this event?

Worst case is that any event on any "main" server would need to propagate to all websockets: as the cluster grows, both the number of event sources and the number of recipients grow, so the total message work grows as O(N^2). That counts as a severe scalability impairment.

This O(N^2) complexity can only be prevented if you can group the related connections into groups that don't grow with the cluster size or the total number of connections. Grouping requires state memory to record which group(s) a connection belongs to.

Remember that there are three ways to store state:

  1. global memory (memcached / redis / DB, ...)
  2. sticky routing (load balancer configuration)
  3. client memory (cookies, browser local storage, link/redirect URLs)

Option 3 counts as the most scalable because it avoids a central state store.
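For the grouping itself, here is a concrete sketch of option 1 (global memory): group membership lives in Redis, with one pub/sub channel per group, and a realtime server subscribes only to the channels of groups its local connections joined. An event then fans out to the group, not to the whole cluster. The key and channel names are assumptions:

```python
# Option-1 sketch: group state in Redis, one channel per group.
# Key/channel naming is an illustrative assumption.
import redis

r = redis.Redis()

def join_group(connection_id, group, pubsub):
    r.sadd(f"group:{group}:members", connection_id)  # remember membership
    pubsub.subscribe(f"group:{group}")               # this server now hosts the group

def publish_to_group(group, payload):
    # Only servers with at least one member of this group are subscribed,
    # so the work scales with group size, not cluster size.
    r.publish(f"group:{group}", payload)
```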

As for passing the messages from the "main" to the "realtime" servers, that traffic is by definition much smaller than the traffic towards the clients, and there are efficient frameworks for pushing pub/sub traffic.
