How to use multiprocessing.Pool in an imported module?

灰色年华 2021-01-05 07:09

I have not been able to implement the suggestion here: Applying two functions to two lists simultaneously.

I guess it is because the module is imported by another mo

2 answers
  •  孤独总比滥情好
    2021-01-05 07:48

    The reason you need to guard multiprocessing code with an if __name__ == "__main__" block is that you don't want it to run again in the child process. That can happen on Windows, where each child has to start a fresh interpreter and re-import the main module, since there's no fork system call to copy the parent process's address space. But the guard is only needed around code that would otherwise run at the top level of the main script, and it's not the only way to keep code from running in the child.
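
    Whether the re-run can happen depends on the start method multiprocessing is using. Here is a minimal sketch for checking that, assuming Python 3.6 or later; children_reimport_main and guard_report are hypothetical helper names made up for this example, not part of the library:

```python
import multiprocessing as mp


def children_reimport_main(start_method):
    """Return True if children re-import the main module under this start method.

    "fork" copies the parent's memory instead, so nothing is re-imported.
    """
    return start_method in ("spawn", "forkserver")


def guard_report():
    """Describe whether the current default start method needs the guard."""
    method = mp.get_start_method()  # "spawn" on Windows
    if children_reimport_main(method):
        return f"{method}: unguarded top-level code runs again in every child"
    return f"{method}: children are forked copies and skip the re-import"


if __name__ == "__main__":
    print(guard_report())
```

    Code that has to work on every platform should be written as if the answer were always "spawn".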

    In your specific case, I think you should put the multiprocessing code in a function. Importing the module only defines the function, so the code won't run in the child process as long as nothing calls the function at import time. Your main module can import the module, then call the function (probably from within an if __name__ == "__main__" block).

    It should be something like this:

    some_module.py:

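    import multiprocessing as mp
    
    def process_males(x):
        ...
    
    def process_females(x):
        ...
    
    args_m = [...] # these could be defined inside the function below if that makes more sense
    args_f = [...]
    
    def do_stuff():
        # max(1, ...) keeps the pool size valid on a single-core machine
        with mp.Pool(processes=max(1, mp.cpu_count() - 1)) as p:
            # submit both jobs first so they can run at the same time
            res_m = p.map_async(process_males, args_m)
            res_f = p.map_async(process_females, args_f)
            # wait here: leaving the with block terminates the pool
            res_m.get()
            res_f.get()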
    

    main.py:

    import some_module
    
    if __name__ == "__main__":
        some_module.do_stuff()
    

    In your real code you might want to pass some arguments or get a return value from do_stuff (which should also be given a more descriptive name than the generic one I've used in this example).
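
    As a rough sketch of that last point, the function could take both argument lists and return both result lists. Everything here is illustrative: double and negate are hypothetical stand-ins for process_males and process_females, and process_both and worker_count are names made up for this example.

```python
import multiprocessing as mp


def double(x):
    """Hypothetical stand-in for process_males."""
    return 2 * x


def negate(x):
    """Hypothetical stand-in for process_females."""
    return -x


def worker_count():
    """Leave one core free, but never ask for fewer than one worker."""
    return max(1, mp.cpu_count() - 1)


def process_both(args_m, args_f, processes=None):
    """Run both jobs on one pool and return their results as two lists."""
    with mp.Pool(processes=processes or worker_count()) as pool:
        # Submit both jobs before waiting so they can overlap.
        res_m = pool.map_async(double, args_m)
        res_f = pool.map_async(negate, args_f)
        # get() blocks until the job is done and re-raises worker errors.
        return res_m.get(), res_f.get()
```

    With this in some_module.py, main.py would call results_m, results_f = some_module.process_both([1, 2], [3]) from inside its if __name__ == "__main__" block, which should give ([2, 4], [-3]).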
