Stopping processes in ThreadPool in Python

时光总嘲笑我的痴心妄想 submitted on 2019-12-05 05:07:23

This is a very interesting use of parallelism.

However, if you are using multiprocessing, the goal is to have many processes running in parallel, as opposed to one process running many threads.

Consider these few changes to implement it using multiprocessing:

You have these functions that will run in parallel:

import time
import multiprocessing as mp


def some_long_task_from_library(wtime):
    time.sleep(wtime)


class MyException(Exception): pass

def do_other_stuff_for_a_bit():
    time.sleep(5)
    raise MyException("Something Happened...")

Let's create and start the processes, say 4:

procs = []  # this is not a Pool, it is just a way to handle the
            # processes instead of calling them p1, p2, p3, p4...
for _ in range(4):
    p = mp.Process(target=some_long_task_from_library, args=(1000,))
    p.start()
    procs.append(p)
mp.active_children()   # lists the children that are still alive; as a side
                       # effect it joins any that have already finished

The processes run in parallel, presumably each on a separate CPU core, but that is up to the OS to decide. You can check in your system monitor.

In the meantime, the main process runs some code that fails, and you want to stop the running child processes rather than leave them orphaned:

try:
    do_other_stuff_for_a_bit()
except MyException as exc:
    print(exc)
    print("Now stopping all processes...")
    for p in procs:
        p.terminate()
print("The rest of the process will continue")

If it doesn't make sense to continue with the main process when one or all of the subprocesses have terminated, you should handle the exit of the main program.
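One way to handle that exit is sketched here, assuming the main program should report failure once its workers have been stopped. The helper names start_workers, stop_workers and run are made up for this illustration and are not part of multiprocessing.

```python
import multiprocessing as mp
import time


def some_long_task_from_library(wtime):
    time.sleep(wtime)


def start_workers(count, wtime):
    """Start `count` child processes and return them (hypothetical helper)."""
    procs = []
    for _ in range(count):
        p = mp.Process(target=some_long_task_from_library, args=(wtime,))
        p.start()
        procs.append(p)
    return procs


def stop_workers(procs, timeout=5):
    """Terminate every live child, then join it so it is reaped."""
    for p in procs:
        if p.is_alive():
            p.terminate()
    for p in procs:
        p.join(timeout)
    return [p.exitcode for p in procs]


def run(other_work, count=4):
    """Run other_work() beside the workers; return an exit status."""
    procs = start_workers(count, 1000)
    try:
        other_work()
    except Exception as exc:
        print(exc)
        return 1              # non-zero status: the main work failed
    finally:
        stop_workers(procs)   # runs on success and on failure
    return 0
```

In a script this would be called as sys.exit(run(do_other_stuff_for_a_bit)) under the usual if __name__ == "__main__": guard, so the status reaches the shell and no child is left running.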

Hope it helps, and you can adapt bits of this for your library.

SRD

As for why Pool did not work: as the documentation notes, the main module needs to be importable by the child processes, and because of the nature of this project, interactive Python is being used.

At the same time, it was not clear why ThreadPool would work, although the clue is right there in the name. ThreadPool creates its pool of worker threads using multiprocessing.dummy, which, as noted here, is just a wrapper around the threading module. Pool uses multiprocessing.Process. This can be seen with this test:

p=ThreadPool(processes=3)
p._pool[0]
<DummyProcess(Thread23, started daemon 12345)> #no terminate() method

p=Pool(processes=3)
p._pool[0]
<Process(PoolWorker-1, started daemon)> #has handy terminate() method if needed

As threads do not have a terminate method the worker threads carry on running until they have completed their current task. Killing threads is messy (which is why I tried to use the multiprocessing module) but solutions are here.
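The same difference can be checked on the classes themselves, without going through the private _pool attribute. A small check, assuming Python 3:

```python
import multiprocessing as mp
import threading
from multiprocessing.dummy import DummyProcess

# A Process can be stopped from outside; a Thread offers no such call.
print(hasattr(mp.Process, "terminate"))         # True
print(hasattr(threading.Thread, "terminate"))   # False

# The ThreadPool worker class is a plain Thread subclass.
print(issubclass(DummyProcess, threading.Thread))  # True
```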

The one warning about the solution using the above:

def wrapper(a,target,q,args=(),kwargs={}):
    '''Used when return value is wanted'''
    q.put(getattr(a,target)(*args,**kwargs))

is that changes to attributes inside the instance of the object are not passed back up to the main program, because the child process works on its own copy of the object. As an example, the class foo above could also have a method such as: def addIP(self, newIP): self.hardwareIP = newIP. A call to r = mp.Process(target=a.addIP, args=("127.0.0.1",)) followed by r.start() does not update a in the main program.
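A minimal sketch of that behaviour follows. The class Foo is a made-up stand-in for the class in the question, and ip_after_child_call is a helper invented for the demonstration.

```python
import multiprocessing as mp


class Foo:
    """Hypothetical stand-in for the class from the question."""

    def __init__(self):
        self.hardwareIP = None

    def addIP(self, newIP):
        self.hardwareIP = newIP


def ip_after_child_call(new_ip):
    """Call addIP in a child process and report what the parent sees."""
    a = Foo()
    r = mp.Process(target=a.addIP, args=(new_ip,))
    r.start()
    r.join()
    return a.hardwareIP  # only the child's copy of `a` was changed


if __name__ == "__main__":
    print(ip_after_child_call("127.0.0.1"))  # prints None
```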

The only way round this for a complex object seems to be shared memory using a custom manager, which can give access to both the methods and attributes of object a. For a very large, complex object based on a library, this may be best done using dir(foo) to populate the manager. If I can figure out how, I'll update this answer with an example (for my future self as much as others).
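A rough sketch of the custom-manager idea, not a tested recipe for a large library object. Foo, FooManager, set_ip and shared_ip_demo are names invented for the illustration. The getIP method is an added assumption: by default the proxy that a manager hands out exposes only the public methods of the object, so an attribute has to be read through a method.

```python
import multiprocessing as mp
from multiprocessing.managers import BaseManager


class Foo:
    """Hypothetical stand-in for the class from the question."""

    def __init__(self):
        self.hardwareIP = None

    def addIP(self, newIP):
        self.hardwareIP = newIP

    def getIP(self):
        # Getter added for this sketch: proxies forward method calls,
        # not plain attribute access.
        return self.hardwareIP


class FooManager(BaseManager):
    pass


# Make manager.Foo() create a Foo inside the manager's server process.
FooManager.register("Foo", Foo)


def set_ip(a, new_ip):
    """Runs in the child; `a` is a proxy to the object in the manager."""
    a.addIP(new_ip)


def shared_ip_demo(new_ip):
    with FooManager() as manager:   # starts and later stops the server
        a = manager.Foo()
        r = mp.Process(target=set_ip, args=(a, new_ip))
        r.start()
        r.join()
        return a.getIP()            # the change is visible to the parent


if __name__ == "__main__":
    print(shared_ip_demo("127.0.0.1"))
```

The object lives in the manager's server process, and both the parent and the child talk to it through proxies, which is why the update survives the child's exit.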

If for some reason using threads is preferable, we can use this.

We can send a signal to the threads we want to terminate. The simplest signal is a global variable:

import time
from multiprocessing.pool import ThreadPool

_FINISH = False

def hang():
    while True:
        if _FINISH:
            break
        print('hanging..')
        time.sleep(10)  # the flag is only re-checked after each sleep


def main():
    global _FINISH
    pool = ThreadPool(processes=1)
    pool.apply_async(hang)
    time.sleep(10)
    _FINISH = True  # ask the worker thread to leave its loop
    pool.terminate()
    pool.join()
    print('main process exiting..')


if __name__ == '__main__':
    main()
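A threading.Event is a common alternative to the module-level flag, because its wait() method doubles as a sleep that can be interrupted. The variant below is only a sketch: passing the event as an argument, returning a loop count, and the short delay in main are choices made for the illustration.

```python
import threading
import time
from multiprocessing.pool import ThreadPool


def hang(stop_event):
    """Loop until stop_event is set; return the number of iterations."""
    rounds = 0
    while not stop_event.is_set():
        rounds += 1
        # wait() returns as soon as the event is set, so the thread does
        # not have to sit out the full 10 seconds before noticing.
        stop_event.wait(10)
    return rounds


def main():
    stop_event = threading.Event()
    pool = ThreadPool(processes=1)
    result = pool.apply_async(hang, (stop_event,))
    time.sleep(0.5)
    stop_event.set()   # ask the worker thread to finish
    pool.close()       # no more tasks; let the current one complete
    pool.join()
    return result.get()


if __name__ == '__main__':
    print('rounds completed:', main())
```

Here close() is used rather than terminate() so that the task's return value can still be collected after the worker has stopped on its own.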