Two functions in parallel with multiple arguments and return values

做~自己de王妃 提交于 2021-02-17 19:42:37

问题


I've got two separate functions. Each of them takes quite a long time to execute.

def function1(arg):
     do_some_stuff_here
     return result1

def function2(arg1, arg2, arg3):
     do_some_stuff_here
     return result2

I'd like to launch them in parallel, get their results (knowing which is which) and process the results afterwards. For what I've understood, multiprocessing is more efficient than Threading in Python 2.7 (GIL related issue). However I'm a bit lost whether it is better to use Process, Pool or Queue and how to implement them in a correct pythonic way for my use case.

Any help appreciated ;)


回答1:


First of all, Process, Pool and Queue all have different use case.

Process is used to spawn a process by creating the Process object.

from multiprocessing import Process

def method1():
    print "in method1"
    print "in method1"

def method2():
    print "in method2"
    print "in method2"

p1 = Process(target=method1) # create a process object p1
p1.start()                   # starts the process p1
p2 = Process(target=method2)
p2.start()

Pool is used to parallelize execution of function across multiple input values.

from multiprocessing import Pool

def method1(x):
    print x
    print x**2
    return x**2

p = Pool(3)
result = p.map(method1, [1,4,9]) 
print result          # prints [1, 16, 81]

Queue is used to communicate between processes.

from multiprocessing import Process, Queue

def method1(x, l1):
    print "in method1"
    print "in method1"
    l1.put(x**2)
    return x

def method2(x, l2):
    print "in method2"
    print "in method2"
    l2.put(x**3)
    return x

l1 = Queue()
p1 = Process(target=method1, args=(4, l1, ))  
l2 = Queue()
p2 = Process(target=method2, args=(2, l2, )) 
p1.start()   
p2.start()      
print l1.get()          # prints 16
print l2.get()          # prints 8

Now, for your case you can use Process & Queue(3rd method) or you can manipulate the pool method to work (below)

import itertools
from multiprocessing import Pool
import sys

def method1(x):         
    print x
    print x**2
    return x**2

def method2(x):        
    print x
    print x**3
    return x**3

def unzip_func(a, b):  
    return a, b    

def distributor(option_args):
    option, args = unzip_func(*option_args)    # unzip option and args 

    attr_name = "method" + str(option)            
    # creating attr_name depending on option argument

    value = getattr(sys.modules[__name__], attr_name)(args) 
    # call the function with name 'attr_name' with argument args

    return value


option_list = [1,2]      # for selecting the method number
args_list = [4,2]        
# list of arg for the corresponding method, (argument 4 is for method1)

p = Pool(3)              # creating pool of 3 processes

result = p.map(distributor, itertools.izip(option_list, args_list)) 
# calling the distributor function with args zipped as (option1, arg1), (option2, arg2) by itertools package
print result             # prints [16,8]

Hope this helps.




回答2:


This is another example I just found, hope it helps, nice and easy ;)

from multiprocessing import Pool

def square(x):
    return x * x

def cube(y):
    return y * y * y

pool = Pool(processes=20)

result_squares = pool.map_async(square, range(10))
result_cubes = pool.map_async(cube, range(10))

print result_squares.get(timeout=3)
print result_cubes.get(timeout=3)



回答3:


Here is an example of:

  1. Run a single function with multiple inputs in parallel using a Pool (square function) Interesting Side Note the mangled op on lines for "5 981 25"
  2. Run multiple functions with different inputs (Both args and kwargs) and collect their results using a Pool (pf1, pf2, pf3 functions)
import datetime
import multiprocessing
import time
import random

from multiprocessing import Pool

def square(x):
    # calculate the square of the value of x
    print(x, x*x)
    return x*x

def pf1(*args, **kwargs):
    sleep_time = random.randint(3, 6)
    print("Process : %s\tFunction : %s\tArgs: %s\tsleeping for %d\tTime : %s\n" % (multiprocessing.current_process().name, "pf1", args, sleep_time, datetime.datetime.now()))
    print("Keyword Args from pf1: %s" % kwargs)
    time.sleep(sleep_time)
    print(multiprocessing.current_process().name, "\tpf1 done at %s\n" % datetime.datetime.now())
    return (sum(*args), kwargs)

def pf2(*args):
    sleep_time = random.randint(7, 10)
    print("Process : %s\tFunction : %s\tArgs: %s\tsleeping for %d\tTime : %s\n" % (multiprocessing.current_process().name, "pf2", args, sleep_time, datetime.datetime.now()))
    time.sleep(sleep_time)
    print(multiprocessing.current_process().name, "\tpf2 done at %s\n" % datetime.datetime.now())
    return sum(*args)

def pf3(*args):
    sleep_time = random.randint(0, 3)
    print("Process : %s\tFunction : %s\tArgs: %s\tsleeping for %d\tTime : %s\n" % (multiprocessing.current_process().name, "pf3", args, sleep_time, datetime.datetime.now()))
    time.sleep(sleep_time)
    print(multiprocessing.current_process().name, "\tpf3 done at %s\n" % datetime.datetime.now())
    return sum(*args)

def smap(f, *arg):
    if len(arg) == 2:
        args, kwargs = arg
        return f(list(args), **kwargs)
    elif len(arg) == 1:
        args = arg
        return f(*args)


if __name__ == '__main__':

    # Define the dataset
    dataset = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]

    # Output the dataset
    print ('Dataset: ' + str(dataset))

    # Run this with a pool of 5 agents having a chunksize of 3 until finished
    agents = 5
    chunksize = 3
    with Pool(processes=agents) as pool:
        result = pool.map(square, dataset)
    print("Result of Squares : %s\n\n" % result)
    with Pool(processes=3) as pool:
        result = pool.starmap(smap, [(pf1, [1,2,3], {'a':123, 'b':456}), (pf2, [11,22,33]), (pf3, [111,222,333])])

    # Output the result
    print ('Result: %s ' % result)


Output:
*******

Dataset: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
1 1
2 4
3 9
4 16
6 36
7 49
8 64
59 81
 25
10 100
11 121
12 144
13 169
14 196
Result of Squares : [1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121, 144, 169, 196]


Process : ForkPoolWorker-6  Function : pf1  Args: ([1, 2, 3],)  sleeping for 3  Time : 2020-07-20 00:51:56.477299

Keyword Args from pf1: {'a': 123, 'b': 456}
Process : ForkPoolWorker-7  Function : pf2  Args: ([11, 22, 33],)   sleeping for 8  Time : 2020-07-20 00:51:56.477371

Process : ForkPoolWorker-8  Function : pf3  Args: ([111, 222, 333],)    sleeping for 1  Time : 2020-07-20 00:51:56.477918

ForkPoolWorker-8    pf3 done at 2020-07-20 00:51:57.478808

ForkPoolWorker-6    pf1 done at 2020-07-20 00:51:59.478877

ForkPoolWorker-7    pf2 done at 2020-07-20 00:52:04.478016

Result: [(6, {'a': 123, 'b': 456}), 66, 666] 

Process finished with exit code 0




来源:https://stackoverflow.com/questions/43474923/two-functions-in-parallel-with-multiple-arguments-and-return-values

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!