Post-processing results after multi-processing in Python

Posted by 徘徊边缘 on 2019-12-11 11:01:16

Question


I have a simple multiprocessing (MP) script that works like a charm. However, as soon as I add a very simple post-processing step on the data generated by MP, the script stops working: it never finishes and seems to run forever. This is the code (as posted, it works perfectly):

import numpy as np
from multiprocessing import Pool

n = 4
nMCS = 10**5

def my_function(j):
    result = []
    for j in range(nMCS // n):
        a = np.random.rand(10,2)
        result.append(a) 
    return result

if __name__ == '__main__':
    __spec__ = "ModuleSpec(name='builtins', loader=<class '_frozen_importlib.BuiltinImporter'>)" # this is because I am using Spyder!

    pool = Pool(processes = n) 

    data = pool.map(my_function, [i for i in range(n)])

    pool.close()
    pool.join()

#final_result = np.concatenate(data)   ### this is what ruins my code! ###

However, if I add final_result = np.concatenate(data) at the end, it never finishes. I am using Spyder, and if I type final_result = np.concatenate(data) in the console AFTER MP is done, I get what I want, i.e. a single concatenated array. But if I put that same line at the very end of the main program, it just doesn't work. Could anyone tell me how to fix this?

P.S. This is a very simple example I put together so you can see what is going on; my real problem is far more complicated, and running the post-processing by hand after MP has finished is not an option.


Answer 1:


As @Ares already implied, you fix the problem by indenting everything below the if __name__ == "__main__": statement into the if-block.

FYI, this happens on Windows, which doesn't provide forking for starting new processes the way Unix-like systems do; it uses 'spawn' as its default (and only) start method. Spawn means the OS has to boot a new process with a fresh interpreter from scratch for every worker process.
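If you are not sure which start method your platform uses, a minimal sketch like the following can tell you. It only relies on multiprocessing.get_start_method() and multiprocessing.get_all_start_methods() from the standard library; the helper name describe_start_methods is made up for this example.

```python
import multiprocessing as mp


def describe_start_methods():
    # Hypothetical helper (not from the question): returns the start method
    # currently in use and the list of methods this platform supports.
    return mp.get_start_method(), mp.get_all_start_methods()


if __name__ == "__main__":
    current, available = describe_start_methods()
    print("current start method:", current)
    print("available start methods:", available)
```

On Windows the list of available methods contains only 'spawn', which is why the guard matters there.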

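Your worker processes need to import your target function my_function. When that happens, everything not protected within the if __name__ == "__main__": block also runs in every child process on import. Here that includes the np.concatenate(data) line, and data is never defined in the children because it is only assigned inside the guarded block.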
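Putting the advice above together, here is a sketch of how the script from the question could look with the post-processing moved inside the guard. The constant names N_WORKERS and N_MCS and the helper post_process are my own renamings for illustration, not something from the original post, and I left out the Spyder-specific __spec__ line, so add it back if your setup needs it.

```python
import numpy as np
from multiprocessing import Pool

N_WORKERS = 4   # "n" in the question
N_MCS = 10**5   # "nMCS" in the question


def my_function(worker_index):
    # worker_index is unused; it only gives pool.map one task per worker.
    result = []
    for _ in range(N_MCS // N_WORKERS):
        result.append(np.random.rand(10, 2))
    return result


def post_process(data):
    # Hypothetical helper wrapping the post-processing step from the question.
    return np.concatenate(data)


if __name__ == "__main__":
    pool = Pool(processes=N_WORKERS)
    data = pool.map(my_function, range(N_WORKERS))
    pool.close()
    pool.join()

    # Indented into the guard, so only the parent process runs this line.
    final_result = post_process(data)
    print(final_result.shape)
```

Only definitions (imports, constants, functions) stay at module level, so a child process that re-imports the file has nothing to execute beyond those definitions.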




Answer 2:


Your problem is that np.concatenate is not called inside the if __name__ == '__main__': block. I suspect that the problem you're encountering is Spyder-specific, but fixing the indentation should resolve it.



Source: https://stackoverflow.com/questions/53195581/post-processing-results-after-multi-processing-in-python
