A generic priority queue for Python

面向向阳花 2020-12-13 03:27

I need to use a priority queue in my Python code, and:

  • am looking for any fast implementations for priority queues
  • optimally, I'd li
12 answers
  • 2020-12-13 03:59

    This is efficient and works for strings or any other hashable type of task, since the counter breaks priority ties before tasks are ever compared :-)

    import itertools
    from heapq import heappush, heappop
    
    pq = []                         # list of entries arranged in a heap
    entry_finder = {}               # mapping of tasks to entries
    REMOVED = '<removed-task>'      # placeholder for a removed task
    counter = itertools.count()     # unique sequence count
    
    def add_task(task, priority=0):
        'Add a new task or update the priority of an existing task'
        if task in entry_finder:
            remove_task(task)
        count = next(counter)
        entry = [priority, count, task]
        entry_finder[task] = entry
        heappush(pq, entry)
    
    def remove_task(task):
        'Mark an existing task as REMOVED.  Raise KeyError if not found.'
        entry = entry_finder.pop(task)
        entry[-1] = REMOVED
    
    def pop_task():
        'Remove and return the lowest priority task. Raise KeyError if empty.'
        while pq:
            priority, count, task = heappop(pq)
            if task is not REMOVED:
                del entry_finder[task]
                return task
        raise KeyError('pop from an empty priority queue')
    

    Reference: http://docs.python.org/library/heapq.html
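
    As a rough illustration, the same lazy-deletion pattern can be wrapped in a small class so that several queues can coexist without sharing module-level state. The class name TaskQueue and its method names are hypothetical, chosen only for this sketch:

```python
import itertools
from heapq import heappush, heappop


class TaskQueue:
    """Hypothetical wrapper around the entry_finder / REMOVED pattern."""

    _REMOVED = object()  # sentinel marking a lazily deleted entry

    def __init__(self):
        self._heap = []
        self._entries = {}                 # task -> [priority, count, task]
        self._counter = itertools.count()  # unique tie-breaker per entry

    def add(self, task, priority=0):
        if task in self._entries:
            self.remove(task)              # re-adding updates the priority
        entry = [priority, next(self._counter), task]
        self._entries[task] = entry
        heappush(self._heap, entry)

    def remove(self, task):
        entry = self._entries.pop(task)    # KeyError if the task is unknown
        entry[-1] = self._REMOVED

    def pop(self):
        while self._heap:
            _, _, task = heappop(self._heap)
            if task is not self._REMOVED:
                del self._entries[task]
                return task
        raise KeyError("pop from an empty priority queue")


q = TaskQueue()
q.add("write docs", priority=3)
q.add("fix bug", priority=1)
q.add("write docs", priority=0)   # update: now the most urgent task
first, second = q.pop(), q.pop()
print(first, "|", second)         # write docs | fix bug
```

    Removed entries stay in the heap until they surface, so memory is only reclaimed as the queue is popped.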

  • 2020-12-13 04:05

    If you want to keep the entire list ordered, not just the top value, you can use a sorted list. I've used some variation of this code in multiple projects; it's a drop-in replacement for the standard list class with a similar API:

    import bisect
    
    class OrderedList(list):
        """Keep a list sorted as you append or extend it
    
        An ordered list, this sorts items from smallest to largest using key, so
        for MaxQueue-like functionality pop from the end: .pop(-1), and
        for MinQueue-like functionality pop from the front: .pop(0)
        """
        def __init__(self, iterable=None, key=None):
            if key:
                self.key = key
            self._keys = []
            super(OrderedList, self).__init__()
            if iterable:
                for x in iterable:
                    self.append(x)
    
        def key(self, x):
            return x
    
        def append(self, x):
            k = self.key(x)
            # https://docs.python.org/3/library/bisect.html#bisect.bisect_right
            # bisect_right always returns an index (never None), so insert there
            i = bisect.bisect_right(self._keys, k)
            super(OrderedList, self).insert(i, (k, x))
            self._keys.insert(i, k)
    
        def extend(self, iterable):
            for x in iterable:
                self.append(x)
    
        def remove(self, x):
            k = self.key(x)
            self._keys.remove(k)
            super(OrderedList, self).remove((k, x))
    
        def pop(self, i=-1):
            self._keys.pop(i)
            return super(OrderedList, self).pop(i)[-1]
    
        def clear(self):
            super(OrderedList, self).clear()
            self._keys.clear()
    
        def __iter__(self):
            for x in super(OrderedList, self).__iter__():
                yield x[-1]
    
        def __getitem__(self, i):
            return super(OrderedList, self).__getitem__(i)[-1]
    
        def insert(self, i, x):
            raise NotImplementedError()
        def __setitem__(self, i, x):
            raise NotImplementedError()
        def reverse(self):
            raise NotImplementedError()
        def sort(self):
            raise NotImplementedError()
    

    It can handle tuples like (priority, value) by default but you can also customize it like this:

    class Val(object):
        def __init__(self, priority, val):
            self.priority = priority
            self.val = val
    
    h = OrderedList(key=lambda x: x.priority)
    
    h.append(Val(100, "foo"))
    h.append(Val(10, "bar"))
    h.append(Val(200, "che"))
    
    print(h[0].val) # "bar"
    print(h[-1].val) # "che"
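
    For plain (priority, value) tuples, the same idea can be sketched without a custom class by calling bisect.insort directly on a list. This is only a minimal illustration of the min/max popping described in the docstring, not a replacement for the class above:

```python
import bisect

jobs = []  # kept sorted by (priority, name) at all times
for priority, name in [(100, "foo"), (10, "bar"), (200, "che")]:
    bisect.insort(jobs, (priority, name))  # O(log n) search, O(n) insert

lowest = jobs.pop(0)    # min-queue style pop
highest = jobs.pop()    # max-queue style pop
print(lowest, highest)  # (10, 'bar') (200, 'che')
```

    Note that every insert into a list shifts elements, so this suits small or read-heavy collections better than large, write-heavy queues.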
    
  • 2020-12-13 04:09

    I've not used it, but you could try PyHeap. It's written in C so hopefully it is fast enough for you.

    Are you positive heapq/PriorityQueue won't be fast enough? It might be worth going with one of them to start, and then profiling to see if it really is your performance bottleneck.
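
    A minimal sketch of that approach: start with heapq and time it on data shaped like your own. The helper name drain_sorted and the input size of 10,000 are arbitrary assumptions for illustration:

```python
import heapq
import timeit


def drain_sorted(items):
    """Push every item onto a heap, then pop them all in priority order."""
    heap = []
    for item in items:
        heapq.heappush(heap, item)
    return [heapq.heappop(heap) for _ in range(len(heap))]


jobs = [(3, "c"), (1, "a"), (2, "b")]
ordered = drain_sorted(jobs)
print(ordered)  # [(1, 'a'), (2, 'b'), (3, 'c')]

# Rough timing of 10,000 pushes and pops, repeated 5 times.
big = [(n * 7919 % 10007, n) for n in range(10000)]
elapsed = timeit.timeit(lambda: drain_sorted(big), number=5)
print("5 runs: {:.3f}s".format(elapsed))
```

    If the measured time is small relative to the rest of the program, a C extension is unlikely to pay off.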

  • 2020-12-13 04:09

    A simple implementation that adds a key function to PriorityQueue.

    Note that PriorityQueue returns the entry with the lowest key first.

    from queue import PriorityQueue
    
    
    class PriorityQueueWithKey(PriorityQueue):
        def __init__(self, key=None, maxsize=0):
            super().__init__(maxsize)
            self.key = key
    
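        def put(self, item, block=True, timeout=None):
            # store (key, item) so entries are ordered by the key
            k = item if self.key is None else self.key(item)
            super().put((k, item), block, timeout)
    
        def get(self, block=True, timeout=None):
            # drop the key and return only the original item
            return super().get(block, timeout)[1]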
    
    
    a = PriorityQueueWithKey(abs)
    a.put(-4)
    a.put(-3)
    print(*a.queue)
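
    One caveat with (key, item) tuples: when two keys are equal, Python falls back to comparing the items, which raises TypeError for items that cannot be ordered. A possible variant, sketched here under the hypothetical name StablePriorityQueue, adds an insertion counter as a tie-breaker:

```python
import itertools
from queue import PriorityQueue


class StablePriorityQueue(PriorityQueue):
    """Hypothetical variant: orders by key(item), then by insertion order."""

    def __init__(self, key=None, maxsize=0):
        super().__init__(maxsize)
        self.key = key if key is not None else (lambda item: item)
        self._count = itertools.count()

    def put(self, item, block=True, timeout=None):
        # The counter is unique, so items themselves are never compared.
        super().put((self.key(item), next(self._count), item), block, timeout)

    def get(self, block=True, timeout=None):
        return super().get(block, timeout)[-1]


q = StablePriorityQueue(key=lambda job: job["priority"])
q.put({"priority": 2, "name": "low"})
q.put({"priority": 1, "name": "first"})
q.put({"priority": 1, "name": "second"})  # same key: dicts are not orderable
names = [q.get()["name"] for _ in range(3)]
print(names)  # ['first', 'second', 'low']
```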
    
  • 2020-12-13 04:11

    If you only have a single "higher priority" level rather than the arbitrarily many supported by queue.PriorityQueue, you can efficiently use a collections.deque instead: insert normal jobs at the left with .appendleft(), insert higher-priority entries at the right with .append(), and consume from the right with .pop().

    Both queue and deque instances have thread-safe push/pop methods.

    Misc advantages to Deques

    • allows peeking arbitrary elements (indexable and iterable without popping, while queue instances can only be popped)
    • significantly faster than queue.PriorityQueue (see sketchy testing below)

    Cautions about length limitations

    • setting a length will let it push elements out of either end, not just off the left, unlike queue instances, which block or raise queue.Full
    • any unbounded collection will eventually run your system out of memory if input rate exceeds consumption

    import threading
    from collections import deque as Deque
    
    Q = Deque()  # don't set a maximum length
    
    def worker_queue_creator(q):
        sleepE = threading.Event()  # use wait method for sleeping thread
        sleepE.wait(timeout=1)
    
        for index in range(3):  # start with a few jobs
            q.appendleft("low job {}".format(index))
    
        q.append("high job 1")  # add an important job
    
        for index in range(3, 3+3):  # add a few more jobs
            q.appendleft("low job {}".format(index))
    
        # one more important job before ending worker
        sleepE.wait(timeout=2)
        q.append("high job 2")
    
        # wait while the consumer worker processes these before exiting
        sleepE.wait(timeout=5)
    
    def worker_queue_consumer(q):
        """ daemon thread which consumes queue forever """
        sleepE = threading.Event()  # use wait method for sleeping thread
        sleepE.wait(timeout=1)  # wait a moment to mock startup
        while True:
            try:
                pre_q_str = str(q)  # see what the Deque looks like before pop
                job = q.pop()
            except IndexError:  # Deque is empty
                pass            # keep trying forever
            else:  # successfully popped job
                print("{}: {}".format(job, pre_q_str))
            sleepE.wait(timeout=0.4)  # quickly consume jobs
    
    
    # create threads to consume and display the queue
    T = [
        threading.Thread(target=worker_queue_creator, args=(Q,)),
        threading.Thread(target=worker_queue_consumer, args=(Q,), daemon=True),
    ]
    
    for t in T:
        t.start()
    
    T[0].join()  # wait on sleep in worker_queue_creator to quit
    
    % python3 deque_as_priorityqueue.py
    high job 1: deque(['low job 5', 'low job 4', 'low job 3', 'low job 2', 'low job 1', 'low job 0', 'high job 1'])
    low job 0: deque(['low job 5', 'low job 4', 'low job 3', 'low job 2', 'low job 1', 'low job 0'])
    low job 1: deque(['low job 5', 'low job 4', 'low job 3', 'low job 2', 'low job 1'])
    low job 2: deque(['low job 5', 'low job 4', 'low job 3', 'low job 2'])
    low job 3: deque(['low job 5', 'low job 4', 'low job 3'])
    high job 2: deque(['low job 5', 'low job 4', 'high job 2'])
    low job 4: deque(['low job 5', 'low job 4'])
    low job 5: deque(['low job 5'])
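
    The length caution mentioned earlier can be seen directly in a tiny sketch. The maxlen of 3 and the job names are arbitrary choices for illustration:

```python
from collections import deque

q = deque(maxlen=3)
for job in ("low 0", "low 1", "low 2"):
    q.appendleft(job)           # normal jobs enter on the left
print(list(q))                  # ['low 2', 'low 1', 'low 0']

q.append("high 1")              # full: the leftmost (newest low) job is dropped
after_high = list(q)
print(after_high)               # ['low 1', 'low 0', 'high 1']

q.appendleft("low 3")           # full: the rightmost job is dropped instead
after_low = list(q)
print(after_low)                # ['low 3', 'low 1', 'low 0']
```

    In the last step a low-priority insert silently evicts the pending high-priority job, which is why the example above leaves the deque unbounded.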
    

    Comparison

    import timeit
    
    NUMBER = 1000
    
    values_builder = """
    low_priority_values  = [(1, "low-{}".format(index)) for index in range(5000)]
    high_priority_values = [(0, "high-{}".format(index)) for index in range(1000)]
    """
    
    deque_setup = """
    from collections import deque as Deque
    Q = Deque()
    """
    deque_logic_input = """
    for item in low_priority_values:
        Q.appendleft(item[1])  # index into tuples to remove priority
    for item in high_priority_values:
        Q.append(item[1])
    """
    deque_logic_output = """
    while True:
        try:
            v = Q.pop()
        except IndexError:
            break
    """
    
    queue_setup = """
    from queue import PriorityQueue
    from queue import Empty
    Q = PriorityQueue()
    """
    queue_logic_input = """
    for item in low_priority_values:
        Q.put(item)
    for item in high_priority_values:
        Q.put(item)
    """
    
    queue_logic_output = """
    while True:
        try:
            v = Q.get_nowait()
        except Empty:
            break
    """
    
    # abuse string catenation to build the setup blocks
    results_dict = {
        "deque input":  timeit.timeit(deque_logic_input, setup=deque_setup+values_builder, number=NUMBER),
        "queue input":  timeit.timeit(queue_logic_input, setup=queue_setup+values_builder, number=NUMBER),
        "deque output": timeit.timeit(deque_logic_output, setup=deque_setup+values_builder+deque_logic_input, number=NUMBER),
        "queue output": timeit.timeit(queue_logic_output, setup=queue_setup+values_builder+queue_logic_input, number=NUMBER),
    }
    
    for k, v in results_dict.items():
        print("{}: {}".format(k, v))
    

    Results (6000 elements pushed and popped, timeit number=1000)

    % python3 deque_priorityqueue_compare.py
    deque input: 0.853059
    queue input: 24.504084000000002
    deque output: 0.0013576999999997952
    queue output: 0.02025689999999969
    

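    While this is a fabricated example to show off deque's performance, it should be fairly representative of a real use case: each PriorityQueue insert is a heap push, which costs O(log n) in the queue's length plus locking overhead, while a deque append is O(1). Note that timeit runs the setup code only once, so the input measurements keep adding to the same growing container, and in the output measurements only the first of the 1000 iterations drains a full container while the rest find it already empty.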
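
    One way to time the drain step on a freshly filled container each time is timeit.repeat with number=1, which reruns the setup before every measurement. This is only a sketch with the same element counts as above; the variable names are assumptions and the timings will differ per machine:

```python
import timeit

setup = """
from collections import deque
from queue import PriorityQueue, Empty

low = [(1, "low-{}".format(i)) for i in range(5000)]
high = [(0, "high-{}".format(i)) for i in range(1000)]

dq = deque()
for _, name in low:
    dq.appendleft(name)
for _, name in high:
    dq.append(name)

pq = PriorityQueue()
for item in low + high:
    pq.put(item)
"""

drain_deque = """
while dq:
    dq.pop()
"""

drain_queue = """
while True:
    try:
        pq.get_nowait()
    except Empty:
        break
"""

# number=1 with repeat=5 reruns setup before every timed drain.
deque_times = timeit.repeat(drain_deque, setup=setup, repeat=5, number=1)
queue_times = timeit.repeat(drain_queue, setup=setup, repeat=5, number=1)
print("deque drain (best of 5):", min(deque_times))
print("queue drain (best of 5):", min(queue_times))
```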

  • 2020-12-13 04:13

    When using a priority queue, decrease-key is a must-have operation for many algorithms (Dijkstra's algorithm, A*, OPTICS), and I wonder why Python's built-in priority queue does not support it. None of the other answers supply a solution that supports this functionality.

    A priority queue that also supports the decrease-key operation is heapdict, an implementation by Daniel Stutzbach. It worked perfectly for me with Python 3.5.

    from heapdict import heapdict
    
    hd = heapdict()
    hd["two"] = 2
    hd["one"] = 1
    obj = hd.popitem()
    print("object:",obj[0])
    print("priority:",obj[1])
    
    # object: one
    # priority: 1
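
    If a third-party package is not an option, a common workaround with plain heapq is to emulate decrease-key lazily: push a new, better entry and skip stale ones when they are popped. The sketch below applies that to Dijkstra's algorithm; it is a different technique from heapdict's, and the graph is made up for illustration:

```python
import heapq


def dijkstra(graph, source):
    """Shortest distances from source; graph maps node -> {neighbor: weight}."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue                      # stale entry: a shorter path was found
        for neighbor, weight in graph[node].items():
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                # "decrease-key": push a better entry, leave the old one behind
                heapq.heappush(heap, (candidate, neighbor))
    return dist


graph = {
    "a": {"b": 4, "c": 1},
    "b": {"d": 1},
    "c": {"b": 2, "d": 5},
    "d": {},
}
distances = dijkstra(graph, "a")
print(distances)  # {'a': 0, 'b': 3, 'c': 1, 'd': 4}
```

    The trade-off is that stale entries stay in the heap until popped, so the heap can grow to the number of edges rather than the number of nodes.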
    