Python Multiprocessing: Crash in subprocess?

Asked by 误落风尘 on 2020-12-06 14:10

What happens when a Python script opens subprocesses and one of them crashes?

https://stackoverflow.com/a/18216437/311901

Will the main process crash?

1 Answer
  • 2020-12-06 14:44

    When using multiprocessing.Pool, if one of the subprocesses in the pool crashes, you will not be notified at all, and a new process will immediately be started to take its place:

    >>> import multiprocessing
    >>> p = multiprocessing.Pool()
    >>> p._processes
    4
    >>> p._pool
    [<Process(PoolWorker-1, started daemon)>, <Process(PoolWorker-2, started daemon)>, <Process(PoolWorker-3, started daemon)>, <Process(PoolWorker-4, started daemon)>]
    >>> [proc.pid for proc in p._pool]
    [30760, 30761, 30762, 30763]
    

    Then in another window:

    dan@dantop:~$ kill 30763
    

    Back to the pool:

    >>> [proc.pid for proc in p._pool]
    [30760, 30761, 30762, 30767]  # New pid for the last process
    

    You can continue using the pool as if nothing happened. However, any work item that the killed child process was running at the time it died will not be completed or restarted. If you were running a blocking map or apply call that was relying on that work item to complete, it will likely hang indefinitely.

    There is a bug filed for this, but the issue was only fixed in concurrent.futures.ProcessPoolExecutor rather than in multiprocessing.Pool. Starting with Python 3.3, ProcessPoolExecutor raises a BrokenProcessPool exception if a child process is killed, and disallows any further use of the pool. Sadly, multiprocessing didn't get this enhancement. For now, if you want to guard against a pool call blocking forever because a sub-process crashed, you have to use ugly workarounds.
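
    To make the contrast concrete, here is a minimal sketch of the ProcessPoolExecutor behavior (Python 3.3+); the suicide helper is my own name for illustration. Killing a worker makes pending futures raise BrokenProcessPool instead of hanging:

    import os
    import concurrent.futures
    from concurrent.futures.process import BrokenProcessPool

    def suicide():
        # Simulate a hard crash by killing this worker process.
        # Signal 9 is SIGKILL on Unix; on Windows, os.kill terminates
        # the process via TerminateProcess for this value.
        os.kill(os.getpid(), 9)

    if __name__ == '__main__':
        with concurrent.futures.ProcessPoolExecutor() as executor:
            future = executor.submit(suicide)
            try:
                future.result()
            except BrokenProcessPool:
                print("A worker died; the pool is unusable from here on")

    With multiprocessing.Pool itself, the usual stopgap is to pass a timeout to AsyncResult.get(), which raises multiprocessing.TimeoutError instead of blocking forever; it cannot distinguish a crash from a genuinely slow work item, which is why it counts as an ugly workaround.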

    Note: the above only applies to a process in a pool actually crashing, meaning the process completely dies. If a sub-process merely raises an exception, that exception will be propagated up to the parent process when you try to retrieve the result of the work item:

    >>> def f(): raise Exception("Oh no")
    ... 
    >>> pool = multiprocessing.Pool()
    >>> result = pool.apply_async(f)
    >>> result.get()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib/python2.7/multiprocessing/pool.py", line 528, in get
        raise self._value
    Exception: Oh no
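
    As an aside (a sketch of mine, not from the original answer): on Python 3, apply_async also accepts an error_callback, so the parent is notified of a child's exception without blocking on get():

    import multiprocessing

    def f():
        raise Exception("Oh no")

    def on_error(exc):
        # Called in the parent process with the exception instance.
        print("Child raised:", exc)

    if __name__ == '__main__':
        pool = multiprocessing.Pool()
        pool.apply_async(f, error_callback=on_error)
        pool.close()
        pool.join()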
    

    When using a multiprocessing.Process directly, the process object will show that the process has exited with a non-zero exit code if it crashes:

    >>> import time
    >>> def f(): time.sleep(30)
    ... 
    >>> p = multiprocessing.Process(target=f)
    >>> p.start()
    >>> p.join()  # Kill the process while this is blocking, and join immediately ends
    >>> p.exitcode  # Negative: the child was terminated by signal 15 (SIGTERM)
    -15
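
    Since exitcode is all the parent sees, one follow-up pattern (my own sketch, not from the original answer) is to supervise the child and restart it whenever it exits abnormally:

    import time
    import multiprocessing

    def worker():
        time.sleep(30)

    def run_supervised(retries=3):
        for _ in range(retries):
            p = multiprocessing.Process(target=worker)
            p.start()
            p.join()
            if p.exitcode == 0:
                return  # clean exit
            print("Worker died with exitcode %s; restarting" % p.exitcode)
        raise RuntimeError("Worker kept crashing")

    if __name__ == '__main__':
        run_supervised()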
    

    The behavior is similar if an exception is raised:

    from multiprocessing import Process
    
    def f(x):
        raise Exception("Oh no")
    
    if __name__ == '__main__':
        p = Process(target=f)  # f expects one argument but gets none, so a TypeError is raised
        p.start()
        p.join()
        print(p.exitcode)
        print("done")
    

    Output:

    Process Process-1:
    Traceback (most recent call last):
      File "/usr/lib/python3.2/multiprocessing/process.py", line 267, in _bootstrap
        self.run()
      File "/usr/lib/python3.2/multiprocessing/process.py", line 116, in run
        self._target(*self._args, **self._kwargs)
    TypeError: f() takes exactly 1 argument (0 given)
    1
    done
    

    As you can see, the traceback from the child is printed, but it doesn't affect execution of the main process, which is able to show that the exitcode of the child was 1. (Note that the exception actually raised in the child here is the TypeError from the missing argument, not the Exception in f's body; any uncaught exception has the same effect.)
