Job queue optimization algorithms

Question

We have an application that requires assignment of jobs to resources. The resources have a number of attributes that define their suitability to a particular job -- some are preferences, some are hard constraints (all of the membership variety, e.g. "resource A is suited to jobs with color X, Y, or Z").

Resources have a cost associated with them (the duration they spend on-line). We have the ability to recruit resources -- this takes a variable amount of time. We can recruit for a fixed interval of time.

To give an idea of scale: There will be about 20 resources at any given time, 100 outstanding jobs. Completion of jobs takes 5-15 seconds. Recruiting a resource takes about 1-2 minutes, and we can recruit from 1-30 minutes of time (rerecruiting is allowed). We don't have much heads-up on jobs being submitted, maybe a few seconds.

The goal is completion of jobs with lowest cost (resource usage) for a given average latency (job completion time).

I'd appreciate pointers to algorithms, software libraries, or approaches to solving this problem.

Answer 1:

Might want to look into the knapsack problem or the bin packing problem as those are similar in principle to what you are trying to do here.
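A minimal sketch of the bin packing view, assuming each recruited resource's time window is a "bin" and each job's duration is an "item". The function name and the first-fit-decreasing heuristic are illustrative choices, not something the question prescribes, and the sketch ignores suitability constraints and arrival times.

```python
def first_fit_decreasing(durations, capacity):
    """Pack job durations (seconds) into bins of `capacity` seconds.

    Uses the first-fit-decreasing heuristic, which gives a good
    but not necessarily optimal number of bins.
    """
    bins = []  # each bin is a list of job durations
    for d in sorted(durations, reverse=True):
        if d > capacity:
            raise ValueError("job longer than the recruited interval")
        for b in bins:
            if sum(b) + d <= capacity:
                b.append(d)  # first bin with enough room
                break
        else:
            bins.append([d])  # no bin fits: open (recruit) a new one
    return bins


bins = first_fit_decreasing([15, 10, 10, 5, 5, 15], capacity=30)
```

Each resulting bin corresponds to one recruited interval, so the number of bins is a rough proxy for cost.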

In your problem description you mention that the goal is the completion of jobs with the lowest latency. If that is actually your only goal, then the solution is simple - hire all available resources. Many of them will be idle much of the time, but it pretty much guarantees the lowest possible latency.

I suspect that your real goal though is to minimize both latency and idle resources as much as possible, so there will always be some tradeoff between latency and wasted resources in play here.

Answer 2:

This feels like a few things: Economic Order Quantity, balancing upfront recruitment cost with run cost; an LP or IP, minimizing a formula for overall cost subject to various constraints; and then there are the probability distributions (time to recruit; job resources required?), making the whole thing stochastic.

It sounds sufficiently complex that, if I were doing it, I would probably set up a simulation. The system doesn't seem too complicated to do that way, or too mathematically onerous to run for large numbers of iterations or long run time, so you can get some fairly stable and useful results.
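As a rough idea of how small such a simulation can start, here is a sketch that assumes Poisson arrivals, service times uniform between 5 and 15 seconds (the range given in the question), a fixed number of interchangeable resources, and first-come-first-served dispatch. Recruiting delays and suitability constraints are left out, and all names and defaults are hypothetical.

```python
import heapq
import random


def simulate(num_resources, arrival_rate, horizon=3600.0, seed=0):
    """Return (average latency, cost) for a fixed pool of resources.

    arrival_rate is jobs per second; cost is total resource-seconds online.
    """
    rng = random.Random(seed)
    free_at = [0.0] * num_resources  # min-heap: when each resource is next free
    heapq.heapify(free_at)
    t = 0.0
    latencies = []
    while True:
        t += rng.expovariate(arrival_rate)  # next arrival time
        if t > horizon:
            break
        start = max(t, heapq.heappop(free_at))  # wait for earliest free resource
        finish = start + rng.uniform(5.0, 15.0)
        heapq.heappush(free_at, finish)
        latencies.append(finish - t)
    avg_latency = sum(latencies) / len(latencies) if latencies else 0.0
    return avg_latency, num_resources * horizon
```

Running it for several pool sizes (for example `simulate(11, 1.0)` against `simulate(20, 1.0)`) traces out the latency/cost tradeoff, and the model can then be extended with recruiting time and job types.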

Answer 3:

I would look at it this way... not sure if it covers everything.

1) A "resource" could actually be seen as a "workcenter". How many workcenters you have open to work on "jobs" is relative to who is signed into the system.

2) Assign resources by waiting time - the longer a resource has been waiting for a job, the higher it should be on the list for the next request. That way no one gets cold or slows down. You will have to find and set a threshold for this (relative to the number of resources and jobs). You can decide whether you want them to click to pick up their next job, or have the system automatically assign one in between jobs.

3) Set up a Job Schedule queue - I don't know if it's relevant, but there might be high/low priority jobs, etc. You need a Pool of jobs, listed "by attack order." Each job on the job schedule can go through the different stages so you know where everything is at and know when you're done to move on to the next one. The Job scheduler will look for available resources and assign them to jobs on the schedule. This will likely be most of the brains of your scheduling system.
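Points 2) and 3) could be sketched roughly as follows. The dictionary fields (`priority`, `color`, `idle_since`, `colors`) are hypothetical names chosen for illustration; "color" stands in for the membership constraint from the question, and a lower `priority` number means the job is attacked first.

```python
def assign_jobs(jobs, idle_resources):
    """Match jobs (in attack order) to the longest-idle suitable resource.

    Returns a list of (job id, resource name) pairs; jobs with no
    suitable idle resource are simply left unassigned.
    """
    # Smallest idle_since timestamp = has been waiting the longest.
    waiting = sorted(idle_resources, key=lambda r: r["idle_since"])
    assignments = []
    for job in sorted(jobs, key=lambda j: j["priority"]):
        for res in waiting:
            if job["color"] in res["colors"]:  # hard membership constraint
                assignments.append((job["id"], res["name"]))
                waiting.remove(res)  # resource is now busy
                break
    return assignments


jobs = [
    {"id": 1, "priority": 2, "color": "X"},
    {"id": 2, "priority": 1, "color": "Z"},
]
resources = [
    {"name": "A", "idle_since": 10.0, "colors": {"X", "Y"}},
    {"name": "B", "idle_since": 5.0, "colors": {"X", "Z"}},
]
result = assign_jobs(jobs, resources)
```

This is a greedy pass, so it can make locally poor choices (for instance giving a flexible resource to a job that a specialised one could have taken); it is only meant to show the shape of the scheduler loop.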

I just hope you're not building an outbound call center :P

Answer 4:

I'm not aware of the literature on problems like this. I assume there is some, though, since queueing theory is a large academic area, and this doesn't sound like a ridiculously contrived situation. Mind you, the fact that you care about average latency, rather than worst-case latency or Nth percentile latency, might put you in the minority.

My first instinct is that since there seem to be plenty of jobs around, a good solution would have several "flexible" workers continuously employed. This is a set of workers who, between them, can complete most types of common jobs with acceptable latency. The lower you want latency to be, the more resources in this set and the more time they spend idle. Also the more "bursty" your input is (assuming bursts are unpredictable), the more idle time you need in order to prevent high latency during bursts.

Then on two occasions you hire additional "specialised" workers:

1) A rare type of job comes in which your current set of hires can only handle at high time cost or not at all. So you hire (roughly speaking) whoever can shift it and then do the most possible of the rest of your queue.

2) No such job is in, but you spot an opportunity to hire someone who just so happens to be able to take some combination of jobs off the current queue, and do them more cheaply than your current hires but without leaving your current hires idle. So you hire that resource.

As for the actual algorithm: it's almost certainly not computationally feasible to find the best solution, so the right answer depends on processing resources and you're looking at heuristics and solving partial problems. As long as everyone you hire is busy, and you can't hire anyone else without causing significant idle time at some point in the future, you're probably in the vicinity of a good solution, and somewhere near the "most bang per buck" point of the latency/cost tradeoff. Hiring more resources after that gives diminishing returns, but that doesn't mean you're not willing to do it for a specified latency and/or specified budget.

But it depends what the incoming jobs look like: if you have a resource that can only do one type of job, and that job only comes in once a day/week/year, then it's probably better to hire them once a day than to wait until you have enough of that job to fill their minimum possible timeslice (which is why firefighters spend most of their time playing card games, whereas typists spend most of their time typing. There's always enough typing to keep at least one typist busy, but fires are rare. Furthermore, we don't want the "most bang per buck" solution for fires, we want lower latency than that). So there are probably opportunities to tweak the algorithm for your particular set of resources and pattern of incoming jobs, if you're solving one particular instance of the problem rather than writing general scheduling software.

Plus presumably if the resources are human beings, you can't actually guarantee that hiring succeeds. They aren't going to sit around all day only getting paid when there happens to be work on a minute-by-minute basis, are they?

Answer 5:

This problem can be viewed as a linear optimization problem, so this should be a start. I have used this library; however, it has quite a lot of other things, so it may be overkill. Alternatively, it is not difficult to develop your own library; this book has a good chapter on LP.
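One possible way to write a static snapshot of the problem as an integer program (dropping the integrality constraint gives an LP relaxation) is sketched below. All symbols are assumptions introduced here: y_r = 1 if resource r is recruited, c_r is its cost, x_{jr} = 1 if job j is assigned to resource r, a_{jr} = 1 if r is suitable for j, p_j is the processing time of job j, and T_r is the interval r is recruited for. The latency target and the stochastic arrivals are not modelled.

```latex
\begin{aligned}
\min_{x,\,y}\quad & \sum_{r} c_r\, y_r \\
\text{s.t.}\quad & \sum_{r} x_{jr} = 1 && \forall j \\
& x_{jr} \le a_{jr}\, y_r && \forall j, r \\
& \sum_{j} p_j\, x_{jr} \le T_r\, y_r && \forall r \\
& x_{jr},\; y_r \in \{0, 1\}
\end{aligned}
```

The first constraint assigns every job exactly once, the second enforces the hard suitability constraints, and the third keeps each resource's workload within its recruited interval.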

Answer 6:

I'm afraid I don't have an easy answer for you, but here are some more related resources to comb through for ideas.

On Multi-dimensional Packing Problems

A Vector-based Strategy for Dynamic Resource Allocation

Answer 7:

Awesome question!!

I would look into dynamic programming, linear optimization, and queueing theory. Unfortunately, there's no real easy way for me to answer these things. I do not have the mathematical expertise necessary to give you a solid, optimal answer for these things.

However, if you are keen on such things, this sounds like an opportunity to get a good, though likely not optimal, solution using an artificial intelligence algorithm. I would recommend either a genetic algorithm or simulated annealing. Neither will take very long to code. The idea is that you pick random, valid inputs and you can tweak these potential solutions, evolving them into better ones (or picking new ones automatically) as time goes by. These are a good compromise between getting optimal answers and spending forever to code and prove correctness.
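To give a feel for how little code simulated annealing needs, here is a minimal sketch. The toy problem (choosing which resources to hire so that every job colour is covered, with a penalty of 100 per uncovered job) and every name in it are assumptions made up for illustration, not part of the original problem statement.

```python
import math
import random

# Hypothetical data: (colours a resource can handle, cost of hiring it).
RESOURCES = [({"X", "Y"}, 3.0), ({"Z"}, 2.0), ({"X", "Y", "Z"}, 4.0), ({"Y"}, 1.0)]
JOB_COLORS = ["X", "Y", "Z", "X", "Z"]


def cost(hired):
    """Hiring cost plus a large penalty for each job nobody can take."""
    covered, total = set(), 0.0
    for use, (colors, price) in zip(hired, RESOURCES):
        if use:
            covered |= colors
            total += price
    uncovered = sum(1 for c in JOB_COLORS if c not in covered)
    return total + 100.0 * uncovered


def flip_one(hired, rng):
    """Neighbour move: toggle whether one randomly chosen resource is hired."""
    i = rng.randrange(len(hired))
    return hired[:i] + (not hired[i],) + hired[i + 1:]


def anneal(cost_fn, initial, neighbor_fn, steps=5000, t0=1.0, cooling=0.999, seed=0):
    rng = random.Random(seed)
    current, current_cost = initial, cost_fn(initial)
    best, best_cost = current, current_cost
    temp = t0
    for _ in range(steps):
        cand = neighbor_fn(current, rng)
        delta = cost_fn(cand) - current_cost
        # Always accept improvements; accept worse moves with a
        # probability that shrinks as the temperature drops.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_cost = cand, current_cost + delta
            if current_cost < best_cost:
                best, best_cost = current, current_cost
        temp *= cooling
    return best, best_cost


best, best_cost = anneal(cost, (True, True, True, True), flip_one)
```

For a real system the cost function would more likely come from a simulation of latency and resource usage than from a closed formula like this one.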

Answer 8:

This sounds very much like Real-Time Operating System Scheduling. Wikipedia details some of the algorithms that are used:

Cooperative scheduling
Round-robin scheduling

Preemptive scheduling
Fixed priority pre-emptive scheduling, an implementation of preemptive time slicing

Critical section preemptive scheduling

Static time scheduling

Earliest Deadline First approach

Advanced scheduling using the stochastic and MTG
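Of these, Earliest Deadline First is probably the easiest to try. A minimal sketch, assuming each job is given a deadline (for example its submission time plus a target latency, which is an assumption since the question only states an average latency goal); the class and method names are hypothetical.

```python
import heapq


class EDFQueue:
    """Job queue that always hands out the job with the earliest deadline."""

    def __init__(self):
        self._heap = []
        self._count = 0  # tie-breaker so equal deadlines stay in FIFO order

    def submit(self, deadline, job):
        heapq.heappush(self._heap, (deadline, self._count, job))
        self._count += 1

    def next_job(self):
        """Return the most urgent job, or None if the queue is empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


queue = EDFQueue()
queue.submit(30.0, "a")
queue.submit(10.0, "b")
queue.submit(20.0, "c")
order = [queue.next_job() for _ in range(4)]
```

Here `order` ends up as `["b", "c", "a", None]`: jobs come out by deadline, not by submission order.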

Answer 9:

A few thoughts:

Are there groups of jobs that can be grouped together, all having the same base requirements?
Are there people that can also be grouped together, all having the same basic skills?

If so, then you can reduce some of the complexity from the outset and then use some form of weighted averages for the preferences. Also, when you recruit, since you recruit for a fixed interval of up to 30 minutes, and assuming recruited resources come at a higher cost, you probably want to make sure they have the highest utilization levels.

Here are some articles that might help:

Job Shop Scheduling - http://en.wikipedia.org/wiki/Job_Shop_Scheduling
Competitive Analysis - http://en.wikipedia.org/wiki/Competitive_analysis_(online_algorithm)
k server problem - http://en.wikipedia.org/wiki/K-server_problem

Source: https://stackoverflow.com/questions/1033099/job-queue-optimization-algorithms

Tags

algorithm

scheduling