Apply reduce on generator output with multiprocessing

爱⌒轻易说出口 submitted on 2019-12-05 08:53:57

To answer my own question, I found a solution that seems to work as I had hoped:

First, Mygenerator is no longer a generator but a plain function. Also, instead of looping through segments of x, y and z inside it, I now pass one segment to the function at a time:

def Myfunction(x_segment, y_segment, z_segment):
    # code that makes two matrices based on input arrays
    return (matrix1, matrix2)

Using multiprocessing.Pool with the imap function, which returns its results as an iterator, seems to work:

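from functools import reduce
import multiprocessing

def call_Myfunction(segments):
    # imap passes each (x, y, z) tuple as ONE argument, so unpack it here
    return Myfunction(*segments)

pool = multiprocessing.Pool(ncpus)
results = pool.imap(call_Myfunction,
                    ((x[i], y[i], z[i]) for i in range(len(x))))
# sum the matrix pairs one result at a time
M1, M2 = reduce(lambda r1, r2: (r1[0] + r2[0], r1[1] + r2[1]), results)
pool.close()
pool.join()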

Here I renamed the lambda's arguments from x and y to r1 and r2 to avoid confusion with the other variables of the same name. When I tried to use a generator with multiprocessing, I ran into trouble with pickle.
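The pickle trouble is probably to be expected: multiprocessing uses pickle to send arguments and return values between processes, and generator objects cannot be pickled. A minimal sketch of that limitation, where `segments` and `can_pickle` are made-up names for illustration and not part of the original code:

```python
import pickle


def can_pickle(obj):
    """Return True if pickle.dumps accepts obj, False if pickle rejects it."""
    try:
        pickle.dumps(obj)
        return True
    except (TypeError, pickle.PicklingError):
        return False


def segments():
    # Hypothetical generator function, standing in for the old Mygenerator.
    for i in range(3):
        yield (i, i + 1)


gen = segments()                      # a generator object
print(can_pickle(gen))                # generator objects are rejected
print(can_pickle(list(segments())))   # a plain list of tuples is accepted
```

A plain function that returns a tuple avoids the problem, since both the function (by name) and its return value can be pickled.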

The only disappointment with this solution is that it didn't speed up the computations very much, which I guess is due to overhead. With 8 cores, the processing speed increased by approximately 10%. When reducing to 4 cores the speed was doubled. This seems to be the best I can do with my particular task, unless there is some other way of parallelizing it...
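One thing that might be worth trying, offered as an untested suggestion rather than something measured on this task: imap accepts a chunksize argument that hands several tasks to a worker at once, and imap_unordered yields results in whatever order they finish, which should be fine here because the sum does not depend on order (up to floating-point rounding). A minimal sketch with a stand-in task; `toy_task`, `add_matrices`, `add_pairs`, `parallel_sum` and the chunksize of 16 are all placeholders, not values from the original problem:

```python
from functools import reduce
import multiprocessing


def toy_task(i):
    # Stand-in for Myfunction: returns a pair of small "matrices" (nested lists).
    return ([[i, 0], [0, i]], [[1, 1], [1, 1]])


def add_matrices(a, b):
    # Element-wise sum of two equally shaped nested lists.
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def add_pairs(r1, r2):
    # Same role as the lambda in the answer: sum two result tuples component-wise.
    return (add_matrices(r1[0], r2[0]), add_matrices(r1[1], r2[1]))


def parallel_sum(n_tasks, ncpus=2, chunksize=16):
    pool = multiprocessing.Pool(ncpus)
    try:
        # chunksize batches tasks per worker round-trip;
        # imap_unordered yields results as soon as they are ready.
        results = pool.imap_unordered(toy_task, range(n_tasks), chunksize=chunksize)
        return reduce(add_pairs, results)
    finally:
        pool.close()
        pool.join()


if __name__ == "__main__":
    M1, M2 = parallel_sum(100)
    print(M1, M2)
```

Whether this helps depends on how expensive each call is compared with the cost of pickling the returned matrices, so it would need to be timed on the real data.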

The imap function was necessary here, since map would store all the returned values in memory before the reduce operation could start, and in this case that would not be possible.
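The difference can be sketched with a small example: Pool.map returns a fully built list, while Pool.imap returns an iterator that can be consumed one result at a time. The names `square` and `running_total` are hypothetical stand-ins, not part of the original code:

```python
import multiprocessing


def square(n):
    # Hypothetical stand-in for an expensive function such as Myfunction.
    return n * n


def running_total(results):
    # Consume results one by one, keeping only the accumulated value.
    total = 0
    for value in results:
        total += value
    return total


if __name__ == "__main__":
    pool = multiprocessing.Pool(2)
    eager = pool.map(square, range(10))    # a list holding every result
    lazy = pool.imap(square, range(10))    # an iterator over the results
    print(type(eager).__name__, running_total(eager))
    print(isinstance(lazy, list), running_total(lazy))
    pool.close()
    pool.join()
```

Both calls give the same total; the difference is only in whether all results have to exist as a list before the accumulation begins.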
