Worker pools and multi-tenant queues with RabbitMQ

悲&欢浪女 2021-02-05 14:28

I work on a multi-tenant, cloud-based web application (lots of clients, each with their own separate "environment", but all on shared sets of hardware) an

4 answers
  •  花落未央
    2021-02-05 14:44

    You could look at the priority queue implementation (which wasn't implemented when this question was originally asked): https://www.rabbitmq.com/priority.html

    If that doesn't work for you, you could try some other hacks to achieve what you want (which should work with older versions of RabbitMQ):

    You could bind 100 queues to a topic exchange and set each task's routing key to (hash of the user ID % 100) + 1, so that every task has a key between 1 and 100 and all tasks for the same user share the same key. Each queue is bound with a unique pattern between 1 and 100. Each worker in your fleet then starts at a random queue number and moves on to the next queue number after each job, wrapping back to queue 1 after queue 100.
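
    As a rough sketch of the key calculation and the worker's queue cycling, something like the following could work. The function names `routing_key` and `next_queue` are made up for illustration, and using SHA-256 as the hash is an assumption (Python's built-in `hash()` is salted per process for strings, so it would not give every publisher the same key).

```python
import hashlib

NUM_QUEUES = 100  # the 100 queues bound to the topic exchange


def routing_key(user_id: str, num_queues: int = NUM_QUEUES) -> str:
    """Map a user ID to a stable routing key in the range 1..num_queues."""
    # Assumption: a stable digest stands in for "a hash of the user ID".
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % num_queues
    return str(bucket + 1)  # shift 0..99 up to 1..100


def next_queue(current: int, num_queues: int = NUM_QUEUES) -> int:
    """Advance to the next queue number, wrapping from num_queues back to 1."""
    return current % num_queues + 1
```

    A publisher would use `routing_key(user_id)` as the message's routing key, and a worker would call `next_queue` after each job to pick the queue it polls next.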

    Now your worker fleet can process up to 100 unique users in parallel, or all the workers can focus on a single user if there is no other work to do. If each worker has to cycle through all 100 queues between jobs, then in the scenario where only a single user has a lot of jobs on a single queue, you're naturally going to have some overhead between each job. Using a smaller number of queues is one way to deal with this. You could also have each worker hold a connection to each of the queues and consume up to one unacknowledged message from each. The worker can then cycle through the pending messages in memory much faster, provided the unacknowledged message timeout is set sufficiently high.
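
    The "one unacknowledged message per queue" idea can be modelled in memory like this. It is only a simulation under stated assumptions: plain `deque` objects stand in for the broker's queues, and the hypothetical `RoundRobinWorker` class stands in for a worker whose consumers each use a prefetch of 1. No RabbitMQ client is involved.

```python
from collections import deque
from typing import Deque, Dict, List, Optional


class RoundRobinWorker:
    """In-memory model of a worker holding at most one unacked message per queue."""

    def __init__(self, queues: Dict[int, Deque[str]]) -> None:
        self.queues = queues  # stand-ins for the broker's queues
        self.pending: Dict[int, Optional[str]] = {q: None for q in queues}
        self.order: List[int] = sorted(queues)
        self.position = 0  # index into self.order of the queue to check first

    def _refill(self) -> None:
        # Models the broker delivering one message per queue (prefetch of 1).
        for q in self.order:
            if self.pending[q] is None and self.queues[q]:
                self.pending[q] = self.queues[q].popleft()

    def next_job(self) -> Optional[str]:
        """Return the next job in round-robin order, or None if nothing is pending."""
        self._refill()
        for step in range(len(self.order)):
            idx = (self.position + step) % len(self.order)
            q = self.order[idx]
            job = self.pending[q]
            if job is not None:
                self.pending[q] = None  # models acknowledging the message
                self.position = (idx + 1) % len(self.order)
                return job
        return None
```

    Because the pending messages are already in the worker's memory, skipping an empty queue costs a dictionary lookup rather than a round trip to the broker.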

    Alternatively, you could create two exchanges, each with a bound queue. All work goes to the first exchange and queue, which a pool of workers consumes. If a unit of work takes too long, the worker can cancel it and push it to the second queue. Workers only process the second queue when there's nothing on the first queue. You might also want a couple of workers with the opposite queue prioritization, to make sure long-running tasks are still processed when there's a never-ending stream of short tasks arriving, so that a user's batch will always be processed eventually. This won't truly distribute your worker fleet across all tasks, but it will stop long-running tasks from one user holding up your workers from executing short-running tasks for that same user or another. It also assumes you can cancel a job and re-run it later without any problems. Unless you can identify fast and slow tasks in advance, it also means there will be wasted resources from tasks that time out and need to be re-run as low priority.
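
    The worker loop for the two-queue approach might look roughly like the sketch below. Again this is an in-memory model, not RabbitMQ client code: the `Task` tuple, the `run_worker` function and the `budget` parameter are all hypothetical, and "cancelling" a task is reduced to comparing its known cost against the time budget.

```python
from collections import deque
from typing import Deque, List, Tuple

Task = Tuple[str, float]  # (task name, seconds the task would need to finish)


def run_worker(fast: Deque[Task], slow: Deque[Task], budget: float) -> List[str]:
    """Drain both queues, always preferring `fast`, and return an event log."""
    log: List[str] = []
    while fast or slow:
        if fast:
            name, cost = fast.popleft()
            if cost > budget:
                # Models cancelling the task once it exceeds the budget
                # and republishing it to the second (low-priority) queue.
                slow.append((name, cost))
                log.append(f"demoted {name}")
            else:
                log.append(f"done {name}")
        else:
            # The second queue is only touched when the first is empty,
            # and tasks taken from it run without a time limit.
            name, _ = slow.popleft()
            log.append(f"done {name}")
    return log
```

    For example, with a 5-second budget, a 30-second task at the head of the first queue is demoted, the short tasks behind it complete, and the long task only runs once the first queue is empty. The work done on the long task before it was cancelled is the wasted resource mentioned above.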

    The first suggestion, with the 100 queues, could also have a problem: if there are 100 slow tasks for a single user and another user then posts a batch of tasks, those tasks won't get looked at until one of the slow tasks has finished. If this turns out to be a legitimate problem, you could potentially combine the two solutions.
