How to parallelize iteration over a range, using StdLib and Python 3?

陌清茗 2020-12-11 21:19

I've been searching for an answer on this for days now, to no avail. I'm probably just not understanding the pieces that are floating around out there, and the Python documentation…

3 Answers
    暗喜 (OP) 2020-12-11 21:46

    import timeit
    from multiprocessing import Pool

    def appendNumber(x):
        # The per-element "work" is trivial here: just hand the argument back.
        return x

    if __name__ == '__main__':  # needed so spawned worker processes can import this file safely
        start = timeit.default_timer()

        with Pool(4) as p:  # pool of 4 worker processes
            numbers = p.map(appendNumber, range(100000000))

        end = timeit.default_timer()

        print('TIME: {} seconds'.format(end - start))
        print('SUM:', sum(numbers))


    So Pool.map is like the builtin map function. It takes a function and an iterable and produces a list of the results of calling that function on every element of the iterable. Since we don't actually want to change the elements of the range here, we just return the argument unchanged.
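
    For comparison, here is a minimal sketch (with a hypothetical square worker, not part of the original code) showing that Pool.map is a drop-in parallel counterpart of the builtin map:

    from multiprocessing import Pool

    def square(x):  # hypothetical worker used only for this comparison
        return x * x

    if __name__ == '__main__':
        sequential = list(map(square, range(10)))  # builtin map, single process
        with Pool(4) as p:
            parallel = p.map(square, range(10))    # same results, computed by 4 worker processes
        print(sequential == parallel)              # prints True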

    The crucial point is that Pool.map splits the provided iterable (range(100000000) here) into chunks, distributes those chunks among its worker processes (4 of them, from Pool(4)), and then joins the results back into a single list.

    The output I get when running this is

    TIME: 8.748245699999984 seconds
    SUM: 4999999950000000
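
    A follow-up note (a tuning assumption, not something measured above): when the per-element work is this cheap, much of the time goes into shipping arguments and results between processes. Pool.map accepts an optional chunksize argument that controls how many elements are handed to a worker at once; a sketch reusing the same appendNumber worker:

    from multiprocessing import Pool

    def appendNumber(x):
        return x

    if __name__ == '__main__':
        with Pool(4) as p:
            # Larger chunks mean fewer, bigger messages between the parent and
            # the workers; by default Pool.map derives a chunksize from the
            # iterable length and the pool size.
            numbers = p.map(appendNumber, range(100000000), chunksize=1000000)
        print(sum(numbers))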
    
