Question
OK, since there are currently no answers I don't feel too bad doing this. While I'm still interested in what is actually happening behind the scenes to cause this problem, my most urgent questions are the ones in update 2, namely: what are the differences between a JoinableQueue and a Manager().Queue() (and when should you use one over the other)? And importantly, is it safe to substitute one for the other in this example?
In the following code, I have a simple process pool. Each process is passed the process queue (pq) to pull data to be processed from, and a return-value queue (rq) to pass the results of the processing back to the main thread. If I don't append to the return-value queue it works, but as soon as I do, for some reason the processes are blocked from stopping. In both cases the processes' run methods return, so it's not the put on the return queue that blocks; but in the second case the processes themselves do not terminate, so the program deadlocks when I join on the processes. Why would this be?
Updates:
1. It seems to have something to do with the number of items in the queue. On my machine at least, I can have up to 6570 items in the queue and it actually works; any more than this and it deadlocks.

2. It seems to work with Manager().Queue(). Whether it's a limitation of JoinableQueue or just me misunderstanding the differences between the two objects, I've found that if I replace the return queue with a Manager().Queue(), it works as expected. What are the differences between them, and when should you use one over the other?

3. The error does not occur if I'm consuming from rq. Oops. There was an answer here for a moment, and as I was commenting on it, it disappeared. Anyway, one of the things it asked was whether the error still occurs if I add a consumer. I have tried this, and the answer is: no, it doesn't (see the consumer sketch after the sample output below). The other thing it mentioned was this quote from the multiprocessing docs as a possible key to the problem. Referring to JoinableQueue, it says:

... the semaphore used to count the number of unfinished tasks may eventually overflow, raising an exception.
import multiprocessing


class _ProcSTOP:
    pass


class Proc(multiprocessing.Process):

    def __init__(self, pq, rq):
        self._pq = pq
        self._rq = rq
        super().__init__()
        print('++', self.name)

    def run(self):
        dat = self._pq.get()

        while dat is not _ProcSTOP:
            # self._rq.put(dat)      # uncomment me for deadlock
            self._pq.task_done()
            dat = self._pq.get()

        self._pq.task_done()
        print('==', self.name)

    def __del__(self):
        print('--', self.name)


if __name__ == '__main__':
    pq = multiprocessing.JoinableQueue()
    rq = multiprocessing.JoinableQueue()
    pool = []

    for i in range(4):
        p = Proc(pq, rq)
        p.start()
        pool.append(p)

    for i in range(10000):
        pq.put(i)

    pq.join()

    for i in range(4):
        pq.put(_ProcSTOP)

    pq.join()

    while len(pool) > 0:
        print('??', pool)
        pool.pop().join()    # hangs here (if using rq)

    print('** complete')
Sample output, not using the return queue:
++ Proc-1
++ Proc-2
++ Proc-3
++ Proc-4
== Proc-4
== Proc-3
== Proc-1
?? [<Proc(Proc-1, started)>, <Proc(Proc-2, started)>, <Proc(Proc-3, started)>, <Proc(Proc-4, started)>]
== Proc-2
?? [<Proc(Proc-1, stopped)>, <Proc(Proc-2, started)>, <Proc(Proc-3, stopped)>]
-- Proc-3
?? [<Proc(Proc-1, stopped)>, <Proc(Proc-2, started)>]
-- Proc-2
?? [<Proc(Proc-1, stopped)>]
-- Proc-1
** complete
-- Proc-4
Sample output, using the return queue:
++ Proc-1
++ Proc-2
++ Proc-3
++ Proc-4
== Proc-2
== Proc-4
== Proc-1
?? [<Proc(Proc-1, started)>, <Proc(Proc-2, started)>, <Proc(Proc-3, started)>, <Proc(Proc-4, started)>]
== Proc-3
# here it hangs
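
For reference, here is a minimal sketch of the consumer variant mentioned in update 3. The drain function, the results list and the fixed count of 10000 are illustrative, not part of the original code; the idea is simply that the main process keeps pulling from rq while the workers run, so the return queue is emptied and the workers can exit:

import threading

def drain(rq, n, results):
    # pull exactly n results out of the return queue so the workers
    # never accumulate unflushed data on their side of the pipe
    for _ in range(n):
        results.append(rq.get())

# inside the existing __main__ block, started before pq is filled:
#
#     results = []
#     consumer = threading.Thread(target=drain, args=(rq, 10000, results))
#     consumer.start()
#     ...                   # put the 10000 items on pq, pq.join(), send _ProcSTOP
#     consumer.join()       # returns once all 10000 results have arrived
#     ...                   # joining the worker processes no longer hangs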
Answer 1:
From the documentation:
Warning
As mentioned above, if a child process has put items on a queue (and it has not used JoinableQueue.cancel_join_thread()), then that process will not terminate until all buffered items have been flushed to the pipe.
This means that if you try joining that process you may get a deadlock unless you are sure that all items which have been put on the queue have been consumed. Similarly, if the child process is non-daemonic then the parent process may hang on exit when it tries to join all its non-daemonic children.
Note that a queue created using a manager does not have this issue. See Programming guidelines.
So a JoinableQueue() uses a pipe, and a process that has put items on it will wait until all of that buffered data has been flushed to the pipe before it terminates.
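Concretely, the deadlock goes away if everything the workers put on rq is consumed before the workers are joined, which is what update 3 observed. A minimal sketch of that change to the end of the __main__ block (the results name and the count of 10000, matching the number of items put on pq, are illustrative):

    # ... as before: fill pq with 10000 items, pq.join(),
    #     put one _ProcSTOP per worker, pq.join() again ...

    # drain the return queue so each worker's buffered items can be
    # flushed through the pipe
    results = []
    for _ in range(10000):
        results.append(rq.get())

    # the workers can now terminate, so joining them no longer hangs
    while len(pool) > 0:
        pool.pop().join()

    print('** complete')

Because pq.join() only returns after every task_done() call, and each worker puts to rq before calling task_done(), all 10000 puts have been issued by that point, so exactly 10000 gets will drain the queue.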
On the other hand, a Manager().Queue() object takes a completely different approach: a manager runs a separate server process that receives all the data immediately and stores it in its own memory, so there is nothing left in the producing process to flush (a sketch of that change follows the quoted docs below).
Managers provide a way to create data which can be shared between different processes. A manager object controls a server process which manages shared objects. Other processes can access the shared objects by using proxies.
...
Queue([maxsize]) Create a shared Queue.Queue object and return a proxy for it.
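
Applied to the question's code, this is the change described in update 2: keep the joinable pq for the work items, but make the return queue a managed queue, so the returned data lives in the manager's server process and the workers have nothing left to flush when they exit. A minimal sketch (the Proc class is unchanged):

import multiprocessing

if __name__ == '__main__':
    manager = multiprocessing.Manager()

    pq = multiprocessing.JoinableQueue()   # still joinable: the code relies on task_done()/join()
    rq = manager.Queue()                   # proxy for a queue living in the manager's server process

    # ... the rest of the __main__ block is unchanged: start the Proc workers,
    #     fill pq, pq.join(), send _ProcSTOP to each worker, join the workers,
    #     and read results back from rq in the parent as needed ...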
Source: https://stackoverflow.com/questions/8026050/subprocess-completes-but-still-doesnt-terminate-causing-deadlock