Optimized algorithm to schedule tasks with dependency?

梦毁少年i 2021-01-30 17:50

There are tasks that read from a file, do some processing and write to a file. These tasks are to be scheduled based on the dependency. Also tasks can be run in parallel, so the

5 answers

暗喜 (original poster)

2021-01-30 18:49

Given a mapping between items, and items they depend on, a topological sort orders items so that no item precedes an item it depends upon.
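That definition can be checked mechanically. The following is only a minimal sketch: `is_valid_order` is a hypothetical helper name, and it assumes every item mentioned in the mapping also appears in the candidate order.

```python
def is_valid_order(order, deps):
    """Return True if no item in `order` comes before an item it depends on."""
    # Position of each item in the candidate ordering
    position = {item: i for i, item in enumerate(order)}
    return all(
        position[dep] < position[item]
        for item, needed in deps.items()
        for dep in needed
        if dep != item  # a self-dependency is ignored
    )


# Each key depends on the items in its set
deps = {'B': {'A'}, 'C': {'A'}, 'D': {'B'}, 'F': {'E'}}

print(is_valid_order(['A', 'E', 'B', 'C', 'F', 'D'], deps))  # True
print(is_valid_order(['B', 'A', 'C', 'D', 'E', 'F'], deps))  # False: B comes before A
```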

The Rosetta Code task on topological sorting has a Python solution that also tells you which items can be processed in parallel.

Given your input the code becomes:

from functools import reduce  # reduce is not a builtin in Python 3

data = { # From: http://stackoverflow.com/questions/18314250/optimized-algorithm-to-schedule-tasks-with-dependency
    # item: set of items it depends on (reverse of how shown in question)
    'B':         set(['A']),
    'C':         set(['A']),
    'D':         set(['B']),
    'F':         set(['E']),
    }

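def toposort2(data):
    """Yield space-separated groups of items; each group depends only on earlier groups."""
    for k, v in data.items():
        v.discard(k)  # Ignore self dependencies
    # Items that appear only as dependencies have no dependencies of their own
    extra_items_in_deps = reduce(set.union, data.values()) - set(data.keys())
    data.update({item: set() for item in extra_items_in_deps})
    while True:
        # Everything with no remaining dependencies can be processed now
        ordered = set(item for item, dep in data.items() if not dep)
        if not ordered:
            break
        yield ' '.join(sorted(ordered))
        # Drop the processed items and remove them from the remaining dependency sets
        data = {item: (dep - ordered) for item, dep in data.items()
                if item not in ordered}
    # Anything left over could never be freed, so a cycle must exist
    assert not data, "A cyclic dependency exists amongst %r" % data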

print('\n'.join(toposort2(data)))

Which then generates this output:

A E
B C F
D

Items on the same line of the output can be processed in any order or, indeed, in parallel, as long as every item on an earlier line has finished before the items on later lines start. That preserves the dependencies.
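One way to act on that rule is to run each line as a batch on a thread pool and wait for the whole batch before starting the next. This is a rough sketch rather than a definitive scheduler: `run_in_batches` and `run_task` are hypothetical names, and `run_task` is a placeholder for the real read/process/write step.

```python
from concurrent.futures import ThreadPoolExecutor


def run_in_batches(batches, run_task, max_workers=4):
    """Run each batch in parallel, finishing one batch before starting the next."""
    finished = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for batch in batches:
            # list() waits for every task in this batch (and re-raises any
            # exception) before the next batch is submitted.
            list(pool.map(run_task, batch))
            finished.append(list(batch))
    return finished


def run_task(name):
    # Hypothetical placeholder: a real task would read a file,
    # process it, and write a file.
    return name


# The three lines printed by toposort2, split back into item names
batches = [line.split() for line in ['A E', 'B C F', 'D']]
print(run_in_batches(batches, run_task))
# [['A', 'E'], ['B', 'C', 'F'], ['D']]
```

Batching by line is simple but can leave workers idle: F depends only on E, yet here it also waits for A. If that matters, a finer-grained scheduler can start each task as soon as its own dependencies are done; on Python 3.9+ the standard library's `graphlib.TopologicalSorter` (`get_ready()` / `done()`) is built for that pattern.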
