Using collections.namedtuple with ProcessPoolExecutor gets stuck in a few cases

北城余情 submitted on 2021-02-17 05:56:45

Question


>>> import concurrent.futures
>>> from collections import namedtuple
>>> #1. Initialise namedtuple here
>>> # tm = namedtuple("tm", ["pk"])  
>>> class T:  
...     #2. Initialise named tuple here
...     #tm = namedtuple("tm", ["pk"]) 
...     def __init__(self): 
...         #3: Initialise named tuple here
...         tm = namedtuple("tm", ["pk"])                       
...         self.x = {'key': [tm('value')]}  
...     def test1(self):  
...         with concurrent.futures.ProcessPoolExecutor(max_workers=1) as executor:  
...             results = executor.map(self.test, ["key"])  
...         return results  
...     def test(self, s): 
...         print(self.x[s])   
... 
>>> t = T().test1()

This gets stuck here.

^CTraceback (most recent call last):
  File "<stdin>", line 1, in <module>
Process ForkProcess-1:
  File "<stdin>", line 10, in test1
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/concurrent/futures/_base.py", line 623, in __exit__
    self.shutdown(wait=True)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/concurrent/futures/process.py", line 681, in shutdown
    self._queue_management_thread.join()
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/threading.py", line 1044, in join
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/concurrent/futures/process.py", line 233, in _process_worker
    call_item = call_queue.get(block=True)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/queues.py", line 94, in get
    res = self._recv_bytes()
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/connection.py", line 216, in recv_bytes
    buf = self._recv_bytes(maxlength)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/connection.py", line 407, in _recv_bytes
    buf = self._recv(4)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/connection.py", line 379, in _recv
    chunk = read(handle, remaining)
KeyboardInterrupt
    self._wait_for_tstate_lock()
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/threading.py", line 1060, in _wait_for_tstate_lock
    elif lock.acquire(block, timeout):
KeyboardInterrupt

If I initialise the named tuple outside of the class (as in #1), this works fine. Could someone please tell me what the issue is when I initialise it as in #2 or #3?


Answer 1:


You're not changing where you initialize the namedtuple. You're changing where you create the namedtuple class.

When you create a namedtuple class named "x" in module "y" with collections.namedtuple, its __module__ is set to 'y' and its __qualname__ is set to 'x'. Pickling and unpickling rely on this class actually being available at the y.x location indicated by these attributes, but in cases 2 and 3 of your example, it's not: in case 2 the class is only reachable as T.tm, and in case 3 it's just a local variable inside __init__.
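
To see the lookup problem without involving a process pool, here is a minimal sketch using pickle directly. The names Point, Local, make_local and can_pickle are made up for illustration and are not from the question; the sketch assumes it is run as a normal script so that module-level names can be found by pickle.

```python
import pickle
from collections import namedtuple

# Module-level class: the variable name matches __qualname__,
# so pickle can find it again as <module>.Point.
Point = namedtuple("Point", ["x"])


def make_local():
    # Created inside a function: __qualname__ is still just "Local",
    # but no module-level name "Local" exists for pickle to look up.
    Local = namedtuple("Local", ["x"])
    return Local(1)


def can_pickle(obj):
    """Return True if pickle.dumps(obj) succeeds, else False."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError):
        return False


module_level_ok = can_pickle(Point(1))
local_ok = can_pickle(make_local())

print(Point.__module__, Point.__qualname__)
print("module-level picklable:", module_level_ok)
print("local picklable:", local_ok)
```

The instance of the module-level class pickles, while the instance of the class created inside the function does not, even though both classes were built by the same namedtuple call.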

Python can't pickle the namedtuple instance, which breaks inter-process communication with the workers. Executing self.test in a worker process relies on pickling self.test and unpickling a copy of it in the worker process, and that can't happen if self.x contains an instance of a class that can't be pickled.
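
As a rough sketch of that step, the snippet below pickles a bound method by hand instead of starting a ProcessPoolExecutor, so it only approximates what the executor does when it sends work to a worker. The class names Good and Bad and the name local_tm are invented for this illustration; Good follows case #1 from the question and Bad follows case #3.

```python
import pickle
from collections import namedtuple

# Case #1 from the question: create the class once, at module level.
tm = namedtuple("tm", ["pk"])


class Good:
    def __init__(self):
        self.x = {"key": [tm("value")]}

    def test(self, s):
        return self.x[s]


class Bad:
    def __init__(self):
        # Case #3: the class exists only as a local variable here.
        local_tm = namedtuple("local_tm", ["pk"])
        self.x = {"key": [local_tm("value")]}

    def test(self, s):
        return self.x[s]


# Pickling a bound method pickles the instance it is bound to,
# including everything stored in self.x.
restored = pickle.loads(pickle.dumps(Good().test))
good_result = restored("key")

try:
    pickle.dumps(Bad().test)
    bad_error = None
except (pickle.PicklingError, AttributeError) as exc:
    bad_error = exc

print("Good:", good_result)
print("Bad:", type(bad_error).__name__ if bad_error else "pickled fine")
```

The round trip works for Good and raises for Bad, which is the same failure the executor hits when it tries to send self.test to a worker.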



Source: https://stackoverflow.com/questions/63609985/using-collections-namedtuple-with-processpoolexecutor-gets-stuck-in-a-few-cases
