multiprocessing.Pool with maxtasksperchild produces equal PIDs

温柔的废话 2021-01-02 04:55

I need to run a function in a process, which is completely isolated from all other memory, several times. I would like to use multiprocessing for that (since I

2 Answers
  • 2021-01-02 05:48

    Observe that with chunksize=1 in a Pool map call, the pool waits for a complete round of processes to finish before it starts a new one.

    from multiprocessing import Pool

    with Pool(3, maxtasksperchild=1) as p:
        p.map(do_job, args_list, chunksize=1)  # do_job and args_list are defined elsewhere
    

    In the example above, the pool waits until the first 3 processes (e.g. PIDs 1000, 1001, 1002) have all finished, and only then starts the next round (1003, 1004, 1005).
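
    One way to check this on your own machine is to have each task return its PID and count how many PIDs served more than one task. This is only a minimal sketch: do_job here is a hypothetical stand-in that sleeps briefly, args_list is an assumed workload of nine items, and count_reused is a helper name introduced for illustration.

```python
import os
import time
from collections import Counter
from multiprocessing import Pool


def do_job(arg):
    # Hypothetical stand-in for the real job: sleep briefly,
    # then report the PID of the worker that ran this task.
    time.sleep(0.1)
    return os.getpid()


def count_reused(pids):
    # Number of distinct PIDs that handled more than one task.
    return sum(1 for n in Counter(pids).values() if n > 1)


if __name__ == "__main__":
    args_list = list(range(9))  # assumed workload, not from the question
    with Pool(3, maxtasksperchild=1) as p:
        pids = p.map(do_job, args_list, chunksize=1)
    print("PIDs in task order:", pids)
    print("PIDs used more than once:", count_reused(pids))
```

    If every task really ran in a fresh process, the second line should report 0. Dropping chunksize=1 from the map call is a quick way to compare the two behaviours.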

  • 2021-01-02 05:50

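    You also need to specify chunksize=1 in the call to pool.map. Otherwise, multiple items in your iterable get bundled together into one "task" from the perspective of the worker processes, so a worker limited to one task by maxtasksperchild still handles several items and prints the same PID more than once: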

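    import multiprocessing
    import os
    import time

    def f(x):
        # Print the PID of the worker process handling this item.
        print("PID: %d" % os.getpid())
        time.sleep(x)
        complex_obj = 5  # more complex, actually
        return complex_obj

    if __name__ == '__main__':
        multiprocessing.set_start_method('spawn')
        # maxtasksperchild=1 retires each worker after a single task;
        # chunksize=1 makes every item in the iterable its own task.
        pool = multiprocessing.Pool(4, maxtasksperchild=1)
        pool.map(f, [5]*30, chunksize=1)
        pool.close()
        pool.join()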
    

    Output doesn't have repeated PIDs now:

    PID: 4912
    PID: 4913
    PID: 4914
    PID: 4915
    PID: 4938
    PID: 4937
    PID: 4940
    PID: 4939
    PID: 4966
    PID: 4965
    PID: 4970
    PID: 4971
    PID: 4991
    PID: 4990
    PID: 4992
    PID: 4993
    PID: 5013
    PID: 5014
    PID: 5012
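
    To see why PIDs repeat when chunksize is left out, it helps to work out the chunk size that map picks by default. The sketch below mirrors the heuristic in CPython's multiprocessing/pool.py as I understand it; that is an implementation detail and may differ between versions. default_chunksize and n_tasks are hypothetical helper names, not library functions.

```python
import math


def default_chunksize(n_items, n_workers):
    # Mirrors CPython's heuristic (an implementation detail):
    # aim for roughly 4 chunks per worker, rounding the chunk size up.
    if n_items == 0:
        return 0
    chunksize, extra = divmod(n_items, n_workers * 4)
    if extra:
        chunksize += 1
    return chunksize


def n_tasks(n_items, chunksize):
    # Each chunk is one "task" as counted by maxtasksperchild.
    return math.ceil(n_items / chunksize)


if __name__ == "__main__":
    size = default_chunksize(30, 4)
    print(size, n_tasks(30, size))  # prints "2 15"
```

    Under that assumption, the 30 items and 4 workers above give a default chunk size of 2, so each single-task worker would process two items and print its PID twice.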
    