multiprocessing.Pool: When to use apply, apply_async or map?

遥遥无期 2020-11-22 15:45

I have not seen clear examples with use-cases for Pool.apply, Pool.apply_async and Pool.map. I am mainly using Pool.map; what are the advantages of others?

3 Answers
  •  刺人心 (OP)
    2020-11-22 16:10

    Here is an overview, in table form, of the differences between Pool.apply, Pool.apply_async, Pool.map, Pool.map_async, Pool.starmap and Pool.starmap_async. When choosing one, take multi-args, concurrency, blocking, and ordering into account:

                       | Multi-args   Concurrency    Blocking     Ordered-results
    ----------------------------------------------------------------------
    Pool.map           | no           yes            yes          yes
    Pool.map_async     | no           yes            no           yes
    Pool.apply         | yes          no             yes          no
    Pool.apply_async   | yes          yes            no           no
    Pool.starmap       | yes          yes            yes          yes
    Pool.starmap_async | yes          yes            no           yes
    

    Notes:

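    • Pool.imap and Pool.imap_unordered – lazier versions of map. They return iterators instead of a complete list; imap_unordered yields results in whatever order they finish.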

    • Pool.starmap is very similar to map, except that it accepts multiple arguments per call.

    • Async methods submit all the tasks at once and return a result object immediately. Call its get method to obtain the results once they are ready.

    • Pool.map and Pool.apply are very similar to the built-in map and to the apply built-in from Python 2. They block the calling process until the work is complete, then return the result.
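The lazier variants from the notes can be sketched as follows. This is a minimal illustration, not taken from the answer: the built-in abs stands in for a real worker function, and the pool size of 2 is an arbitrary assumption.

```python
from multiprocessing import Pool


def lazy_iteration():
    """Consume imap and imap_unordered; return both result lists."""
    with Pool(processes=2) as pool:
        # imap returns an iterator that yields results in input order.
        ordered = list(pool.imap(abs, [-3, -2, -1]))
        # imap_unordered yields results as they finish, so the order
        # is not guaranteed; sort here only to get a stable value.
        unordered = sorted(pool.imap_unordered(abs, [-3, -2, -1]))
    return ordered, unordered


if __name__ == "__main__":
    print(lazy_iteration())
```

Both iterators are consumed inside the with block because leaving the block terminates the pool.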

    Examples:

    map

    Called once for a whole list of jobs; blocks until all results are ready.

    results = pool.map(func, [1, 2, 3])
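As a complete script, the map call might look like the sketch below. math.sqrt is only a stand-in for func, and the pool size is an assumption.

```python
import math
from multiprocessing import Pool


def run_map():
    """Apply one single-argument function to every item of a list."""
    with Pool(processes=2) as pool:
        # Blocks until every item is processed; results keep input order.
        return pool.map(math.sqrt, [1, 4, 9])


if __name__ == "__main__":
    print(run_map())
```

The `if __name__ == "__main__":` guard matters on platforms that start workers by re-importing the main module.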
    

    apply

    Handles one job per call and blocks until that job is done

    results = []
    for x, y in [[1, 1], [2, 2]]:
        results.append(pool.apply(func, (x, y)))

    # Callback used by the async examples that follow
    def collect_result(result):
        results.append(result)
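For comparison, the apply loop can be written as a self-contained script. This is a sketch under assumptions: operator.add stands in for func, and the inputs are the ones used in the loop above.

```python
import operator
from multiprocessing import Pool


def run_apply():
    """Submit jobs one at a time; each call blocks until it finishes."""
    results = []
    with Pool(processes=2) as pool:
        for x, y in [(1, 1), (2, 2)]:
            # apply runs func(x, y) in a worker and waits for the value,
            # so the jobs run one after another, not concurrently.
            results.append(pool.apply(operator.add, (x, y)))
    return results


if __name__ == "__main__":
    print(run_apply())
```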
    

    map_async

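    Called once for a whole list of jobs; returns a result object immediately instead of blocking

    # The callback is invoked once, with the complete list of results
    pool.map_async(func, jobs, callback=collect_result)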
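One detail worth seeing in action is that the map_async callback fires once with the whole result list, not once per item. A minimal sketch, assuming math.factorial as the worker and a 30-second timeout chosen arbitrarily:

```python
import math
from multiprocessing import Pool


def run_map_async():
    """Return the results and everything the callback received."""
    batches = []
    with Pool(processes=2) as pool:
        # Returns an AsyncResult right away; work continues in the pool.
        pending = pool.map_async(math.factorial, [3, 4, 5],
                                 callback=batches.append)
        # get() waits for the full, input-ordered list of results.
        results = pending.get(timeout=30)
    return results, batches


if __name__ == "__main__":
    print(run_map_async())
```

Because the callback receives a list, appending it stores one nested list; use extend in the callback if a flat list is wanted.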
    

    apply_async

    Handles one job per call, but returns immediately and runs the job in the background, in parallel with other submitted jobs

    for x, y in [[1, 1], [2, 2]]:
        pool.apply_async(func, (x, y), callback=collect_result)
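The ordering caveat for apply_async can be handled by keeping the returned result objects. The sketch below is illustrative only: operator.mul stands in for the worker, and the name handles is hypothetical.

```python
import operator
from multiprocessing import Pool


def run_apply_async():
    """Submit jobs in the background and collect results two ways."""
    completed = []  # filled by the callback, in completion order
    with Pool(processes=2) as pool:
        handles = [
            pool.apply_async(operator.mul, (x, y), callback=completed.append)
            for x, y in [(1, 1), (2, 2), (3, 3)]
        ]
        # Calling get() on the handles in submission order restores
        # the input order, whatever order the jobs finished in.
        in_order = [h.get(timeout=30) for h in handles]
    return in_order, completed


if __name__ == "__main__":
    print(run_apply_async())
```

The callback list may come back in any order, while the list built from the handles always matches the order of submission.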
    

    starmap

    A variant of pool.map that supports multiple arguments: each item of the iterable is unpacked into the arguments of one call

    pool.starmap(func, [(1, 1), (2, 1), (3, 1)])
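A runnable version of the starmap call, as a sketch: operator.add is an assumed stand-in for func, and the argument tuples are the ones from the line above.

```python
import operator
from multiprocessing import Pool


def run_starmap():
    """Call a two-argument function once per argument tuple."""
    with Pool(processes=2) as pool:
        # Each tuple is unpacked, so this computes add(1, 1), add(2, 1)
        # and add(3, 1); results come back in input order.
        return pool.starmap(operator.add, [(1, 1), (2, 1), (3, 1)])


if __name__ == "__main__":
    print(run_starmap())
```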
    

    starmap_async

    A combination of starmap() and map_async() that iterates over an iterable of iterables and calls func with each inner iterable unpacked. Returns a result object; the callback receives the complete list of results.

    pool.starmap_async(func, [(1, 1), (2, 1), (3, 1)], callback=collect_result)
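The result object returned by starmap_async can also be polled before the values are fetched. A minimal sketch, assuming operator.add as the worker and an arbitrary 30-second timeout:

```python
import operator
from multiprocessing import Pool


def run_starmap_async():
    """Submit argument tuples in the background, then inspect the result."""
    with Pool(processes=2) as pool:
        pending = pool.starmap_async(operator.add, [(1, 1), (2, 1), (3, 1)])
        # wait() blocks until the work is done or the timeout expires.
        pending.wait(timeout=30)
        finished = pending.ready()
        # get() returns the results as a list in input order.
        values = pending.get(timeout=30)
    return finished, values


if __name__ == "__main__":
    print(run_starmap_async())
```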
    

    Reference:

    Find complete documentation here: https://docs.python.org/3/library/multiprocessing.html
