How to manage scope using multiprocessing

Posted by 我的未来我决定 on 2019-12-13 03:54:43

Question


I'm trying to implement a function that uses Python multiprocessing to speed up a calculation. I'm building a pairwise distance matrix, but the implementation with for loops takes more than 8 hours.

This code seems to run faster, but when I print the matrix it is full of zeros. When I print the rows inside the function, the values look correct. I think it is a scope problem, but I cannot understand how to deal with it.

import multiprocessing
import time
import numpy as np

def MultiProcessedFunc(i,x):
    for j in range(i,len(x)):
        time.sleep(0.08)
        M[i,j] = (x[i]+x[j])/2
    print(M[i,:]) # Check if the operation works
    print('')

processes = []

v = [x+1 for x in range(8000)]
M = np.zeros((len(v),len(v)))
start = time.time() # Start the timer used in the final print

for i in range(len(v)):
    p = multiprocessing.Process(target = MultiProcessedFunc, args =(i,v))
    processes.append(p)
    p.start()

for process in processes:
    process.join()
end = time.time()

print('Multiprocessing: {}'.format(end-start))
print(M)


Answer 1:


Unfortunately, your code won't work as written. multiprocessing starts separate processes, which means that their memory spaces are separate. Changes made by one subprocess are not reflected in the other processes or in the parent process, so the M you print at the end is the parent's untouched copy.

Strictly speaking this is not a scoping issue. Scope is something defined inside a single interpreter process.
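To see the separation in isolation, here is a minimal sketch (not the asker's code). The names write_row and parent_view_after_child_write are made up for illustration, and the array is kept tiny on purpose.

```python
import multiprocessing

import numpy as np

# Module-level array, analogous to M in the question.
M = np.zeros(3)


def write_row(i):
    # Runs in the child process and modifies the child's own copy of M.
    M[i] = 42.0


def parent_view_after_child_write():
    """Start one child that writes to M, then return the parent's M."""
    p = multiprocessing.Process(target=write_row, args=(0,))
    p.start()
    p.join()
    return M.copy()


if __name__ == "__main__":
    # The child's write never reaches the parent's array.
    print(parent_view_after_child_write())  # prints [0. 0. 0.]
```

The name M resolves fine in both processes, which is why this is a memory issue rather than a scoping one: each process simply has its own array behind that name.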

The module does provide means of sharing memory between processes, but this comes at a cost (shared memory is typically slower, because access to it has to be synchronized with locks and the like).
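As a rough sketch of what sharing memory could look like, the version below backs the matrix with a multiprocessing.RawArray and wraps it with np.frombuffer in each process. The function names fill_row and shared_pairwise are hypothetical. RawArray is the variant without a lock; that is only acceptable here under the assumption that every process writes to a different row. multiprocessing.Array would add the lock, and with it the synchronization cost mentioned above.

```python
import multiprocessing

import numpy as np


def fill_row(shared, n, i, x):
    # Re-wrap the shared buffer as an n x n array inside the child process.
    M = np.frombuffer(shared, dtype=np.float64).reshape(n, n)
    for j in range(i, n):
        M[i, j] = (x[i] + x[j]) / 2


def shared_pairwise(x):
    n = len(x)
    # Zero-initialized block of n*n doubles that child processes can write to.
    # No lock: safe only because each process touches its own row.
    shared = multiprocessing.RawArray("d", n * n)
    processes = [
        multiprocessing.Process(target=fill_row, args=(shared, n, i, x))
        for i in range(n)
    ]
    for p in processes:
        p.start()
    for p in processes:
        p.join()
    # Copy the data out so the result no longer depends on the shared buffer.
    return np.frombuffer(shared, dtype=np.float64).reshape(n, n).copy()


if __name__ == "__main__":
    print(shared_pairwise([1, 2, 3, 4]))
```

One process per row is kept only to mirror the question. For 8000 rows you would probably want far fewer processes, each handling a block of rows.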

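Now, numpy has a nice feature: it releases the GIL during many array computations. This means that using multithreading instead of multiprocessing should give you some benefit with few other changes to your code: simply replace import multiprocessing with import threading, and multiprocessing.Process with threading.Thread. Because threads share the memory of the process that started them, the writes to M are visible in the main thread, and the code should produce the correct result. (Note that the GIL is released mainly for operations on whole arrays; an element-by-element Python loop like this one still runs mostly under the GIL.) On my machine, with the print statements and the sleep call removed, it runs in under 8 seconds: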

Multiprocessing: 7.48570203781
[[1.000e+00 1.000e+00 2.000e+00 ... 3.999e+03 4.000e+03 4.000e+03]
 [0.000e+00 2.000e+00 2.000e+00 ... 4.000e+03 4.000e+03 4.001e+03]
 [0.000e+00 0.000e+00 3.000e+00 ... 4.000e+03 4.001e+03 4.001e+03]
 ...
 [0.000e+00 0.000e+00 0.000e+00 ... 7.998e+03 7.998e+03 7.999e+03]
 [0.000e+00 0.000e+00 0.000e+00 ... 0.000e+00 7.999e+03 7.999e+03]
 [0.000e+00 0.000e+00 0.000e+00 ... 0.000e+00 0.000e+00 8.000e+03]]
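For reference, the threading swap described above might look roughly like this. It is a sketch rather than the exact code that produced the timing: the names fill_row and threaded_pairwise are made up, and M is passed in as an argument instead of being a global.

```python
import threading

import numpy as np


def fill_row(M, i, x):
    # Threads share the caller's memory, so this write lands in the caller's M.
    for j in range(i, len(x)):
        M[i, j] = (x[i] + x[j]) / 2


def threaded_pairwise(x):
    M = np.zeros((len(x), len(x)))
    # One thread per row, mirroring the one-process-per-row layout of the question.
    threads = [
        threading.Thread(target=fill_row, args=(M, i, x)) for i in range(len(x))
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return M


if __name__ == "__main__":
    print(threaded_pairwise([1, 2, 3, 4]))
```

The whole-number entries in the output quoted above (for example 1.000e+00 where (1+2)/2 would be 1.5) suggest it was produced under Python 2, where / on two integers truncates. Under Python 3 the same expression gives 1.5.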

An alternative is to have your subprocesses return the result and then combine the results in your main process.
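A minimal sketch of that alternative, assuming a multiprocessing.Pool in which each task computes one row and returns it together with its index. The names compute_row and pooled_pairwise and the workers parameter are hypothetical.

```python
import multiprocessing

import numpy as np


def compute_row(args):
    # Runs in a worker process: build one row and send it back to the parent.
    i, x = args
    row = np.zeros(len(x))
    for j in range(i, len(x)):
        row[j] = (x[i] + x[j]) / 2
    return i, row


def pooled_pairwise(x, workers=2):
    M = np.zeros((len(x), len(x)))
    tasks = [(i, x) for i in range(len(x))]
    with multiprocessing.Pool(processes=workers) as pool:
        # Rows may arrive in any order; the returned index says where each goes.
        for i, row in pool.imap_unordered(compute_row, tasks):
            M[i, :] = row  # the parent process assembles the final matrix
    return M


if __name__ == "__main__":
    print(pooled_pairwise([1, 2, 3, 4]))
```

Only the parent ever writes to M, so no shared memory or locking is needed. The price is that every row is pickled and sent back to the parent, and in this sketch x is also sent along with every task, which would be wasteful for a large input.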



Source: https://stackoverflow.com/questions/54698275/how-to-manage-scope-using-multiprocessing
