Function that multiprocesses another function

Submitted by 吃可爱长大的小学妹 on 2019-12-05 18:06:46

One of the basic concepts in Python multiprocessing is the queue. Queues work quite well when you have an input list that can be iterated over and that the sub-processes do not need to alter. They also give you good control over the processes: you spawn exactly as many as you want, and you can let them idle or stop them.

It is also a lot easier to debug. Sharing data explicitly is usually much more difficult to set up correctly.

Queues can hold any picklable object, because each item is pickled before it is passed to another process. So you can fill them with file path strings for reading files, plain numbers for doing calculations, or even images for drawing.
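To make the point about queue contents concrete, here is a minimal sketch. What matters for `mp.Queue` is that an item can be pickled, not its type. The helper names `is_picklable` and `round_trip` and the sample payloads are made up for illustration.

```python
import multiprocessing as mp
import pickle


def is_picklable(obj):
    """Return True if obj can be pickled, which mp.Queue needs for transfer."""
    try:
        pickle.dumps(obj)
        return True
    except Exception:
        return False


def round_trip(items):
    """Put each item on a queue and read it straight back."""
    queue = mp.Queue()
    received = []
    for item in items:
        queue.put(item)
        received.append(queue.get())  # blocks until the item has arrived
    return received


# Hypothetical payloads: a file path, numbers, a tuple, a dict, raw bytes
payloads = ["data/input_01.txt", 42, 3.14, (1, 2), {"key": [1, 2, 3]}, b"\x00\x01"]

if __name__ == "__main__":
    print(round_trip(payloads) == payloads)
    print(is_picklable(lambda x: x))  # lambdas cannot be pickled
```

Objects that fail the `is_picklable` check, such as lambdas or open file handles, should not be put on a queue; pass a file path or an identifier instead and let the worker open the resource itself.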

In your case a layout could look like this:

import multiprocessing as mp
import numpy as np
import itertools as it


def worker1(in_queue, out_queue):
    # blocks while nothing is available, stops when 'STOP' is seen
    for a in iter(in_queue.get, 'STOP'):
        result = a  # placeholder: replace with your actual computation
        out_queue.put({a: result})  # return your result linked to the input

def worker2(in_queue, out_queue):
    for a in iter(in_queue.get, 'STOP'):
        result = a  # placeholder: replace with a different computation
        out_queue.put({a: result})  # return your result linked to the input

def multiprocess_loop_grouped(function, param_list, group_size, Nworkers, *args):
    # your final result
    result = {}

    in_queue = mp.Queue()
    out_queue = mp.Queue()

    # fill your input
    for a in param_list:
        in_queue.put(a)
    # stop command at end of input
    for n in range(Nworkers):
        in_queue.put('STOP')

    # setup your worker process doing task as specified
    process = [mp.Process(target=function,
               args=(in_queue, out_queue), daemon=True) for x in range(Nworkers)]

    # run processes
    for p in process:
        p.start()

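    # collect your results from the calculations; drain the output queue
    # before joining, because a process that still has items buffered in a
    # queue may not terminate, so joining first can deadlock
    for a in param_list:
        result.update(out_queue.get())

    # wait for processes to finish
    for p in process:
        p.join()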

    return result

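if __name__ == '__main__':
    # param_list, group_size, Nworkers and args must be defined by you;
    # the guard is needed on platforms that spawn processes instead of forking
    temp = multiprocess_loop_grouped(worker1, param_list, group_size, Nworkers, *args)
    mapped = multiprocess_loop_grouped(worker2, param_list, group_size, Nworkers, *args)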
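
The `iter(in_queue.get, 'STOP')` loop in the workers uses the two-argument form of `iter`: it calls `in_queue.get` repeatedly and ends as soon as the returned value equals the sentinel. The following single-process sketch shows only that behaviour; `drain` and `demo` are illustrative names, not part of the layout above.

```python
import multiprocessing as mp

STOP = "STOP"  # sentinel value, the same string the workers look for


def drain(queue):
    """Collect items until the sentinel is read; the sentinel is not returned."""
    return [item for item in iter(queue.get, STOP)]


def demo():
    queue = mp.Queue()
    for item in (1, 2, 3):
        queue.put(item)
    queue.put(STOP)
    queue.put(4)  # queued after the sentinel, so drain() never sees it
    consumed = drain(queue)
    leftover = queue.get()  # still waiting on the queue
    return consumed, leftover


if __name__ == "__main__":
    print(demo())
```

This is why the layout puts one `'STOP'` per worker on the input queue: each worker consumes exactly one sentinel and then leaves its loop.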

It can be made a bit more dynamic if you are afraid that your queues will run out of memory. Then you need to fill and empty the queues while the processes are running. See this example here.
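
One possible way to fill and empty the queues while the processes run is sketched below. It is an assumption about how this could be done, not the example referred to above. It uses bounded queues (`maxsize`) and a feeder thread in the parent process; `square_worker`, `feed`, `run_streaming` and `max_pending` are hypothetical names, and squaring stands in for the real work.

```python
import multiprocessing as mp
import threading


def square_worker(in_queue, out_queue):
    # Placeholder computation: square each input until 'STOP' is seen
    for value in iter(in_queue.get, "STOP"):
        out_queue.put((value, value * value))


def feed(in_queue, params, n_workers):
    for value in params:
        in_queue.put(value)  # blocks while the bounded queue is full
    for _ in range(n_workers):
        in_queue.put("STOP")  # one stop command per worker


def run_streaming(params, n_workers=2, max_pending=4):
    params = list(params)
    # Bounded queues keep at most max_pending items in flight on each side
    in_queue = mp.Queue(maxsize=max_pending)
    out_queue = mp.Queue(maxsize=max_pending)

    workers = [mp.Process(target=square_worker,
                          args=(in_queue, out_queue), daemon=True)
               for _ in range(n_workers)]
    for w in workers:
        w.start()

    # Fill the input queue from a thread so the main thread can read results
    feeder = threading.Thread(target=feed,
                              args=(in_queue, params, n_workers), daemon=True)
    feeder.start()

    results = {}
    for _ in params:  # exactly one result is expected per input
        key, value = out_queue.get()
        results[key] = value

    feeder.join()
    for w in workers:
        w.join()
    return results


if __name__ == "__main__":
    print(run_streaming(range(10)))
```

Because both queues are bounded, memory use stays roughly proportional to `max_pending` rather than to the length of the input. Note that results are keyed by input value, so duplicate inputs would overwrite each other in the returned dictionary.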

Final words: this is not the more Pythonic solution you asked for, but it is easier for a newbie to understand ;-)
