multiprocessing pool.map: calling functions in a certain order

Submitted by ⅰ亾dé卋堺 on 2019-11-30 06:44:58

This happens because each process is handed a predefined amount of work at the start of the call to map, and that amount depends on the chunksize. We can work out the default chunksize by looking at the source for pool.map:

chunksize, extra = divmod(len(iterable), len(self._pool) * 4)
if extra:
    chunksize += 1

So for a range of 20 and 4 processes, divmod(20, 4 * 4) gives a quotient of 1 and a remainder of 4, and because the remainder is non-zero the chunksize is rounded up to 2.
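To check the arithmetic for other input sizes, the two lines quoted from the source can be wrapped in a small helper. This is only a sketch: the function name default_chunksize and its parameters are made up for illustration and are not part of the multiprocessing API.

```python
def default_chunksize(num_items, num_workers):
    """Mirror the default chunksize calculation quoted from pool.map.

    Hypothetical helper for illustration; not part of multiprocessing.
    """
    # Aim for roughly four chunks per worker, rounding up on a remainder.
    chunksize, extra = divmod(num_items, num_workers * 4)
    if extra:
        chunksize += 1
    return chunksize


print(default_chunksize(20, 4))   # divmod(20, 16) == (1, 4), so 2
print(default_chunksize(100, 4))  # divmod(100, 16) == (6, 4), so 7
```

With 16 items and 4 processes the remainder is zero, so the chunksize stays at 1.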

If we modify your code to pass this chunksize explicitly, we should get results similar to the ones you are getting now:

proc_pool.map(SomeFunc, range(num_calls), chunksize=2)

This yields the output:

0 2 6 4 1 7 5 3 8 10 12 14 9 13 15 11 16 18 17 19
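The pattern in that output follows from how the input is batched. The sketch below splits a range into chunks of two in plain Python; split_into_chunks is a hypothetical helper written for this illustration, modelled on the idea of slicing the iterable into fixed-size tuples, and it does not call into the pool itself.

```python
from itertools import islice


def split_into_chunks(iterable, size):
    """Yield successive tuples of at most `size` items from `iterable`.

    Hypothetical helper for illustration; not part of multiprocessing.
    """
    it = iter(iterable)
    while True:
        chunk = tuple(islice(it, size))
        if not chunk:
            return
        yield chunk


chunks = list(split_into_chunks(range(20), 2))
print(chunks[:4])  # [(0, 1), (2, 3), (4, 5), (6, 7)]
```

With four processes each starting on one of these first four chunks, the first calls to begin are 0, 2, 4 and 6 (in whatever order the processes happen to run), followed by the second item of each chunk: 1, 3, 5 and 7. That matches the shape of the output above.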

Now, setting chunksize=1 ensures that each process in the pool is given only one task at a time.

proc_pool.map(SomeFunc, range(num_calls), chunksize=1)

This should give a reasonably good numerical ordering compared with not specifying a chunksize. For example, a chunksize of 1 yields the output:

0 1 2 3 4 5 6 7 9 10 8 11 13 12 15 14 16 17 19 18

What about changing map to imap:

import os
from multiprocessing import Pool
import time

num_proc = 4
num_calls = 20
sleeper = 0.1

def SomeFunc(arg):
    time.sleep(sleeper)
    print("%s %5d" % (os.getpid(), arg))  # parenthesised form works on Python 2 and 3
    return arg

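# Guard pool creation so the module can be re-imported safely by worker processes
if __name__ == "__main__":
    proc_pool = Pool(num_proc)
    list(proc_pool.imap(SomeFunc, range(num_calls)))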

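The reason may be that the default chunksize of imap is 1, so each process takes one task at a time instead of a precomputed batch. The trade-off is that it may not run as fast as map on long iterables.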
