Parallelize these nested for loops in python

▼魔方 西西 提交于 2019-12-01 04:46:52

If you want a more general solution, taking advantage of fully parallel execution, then why not use something like this:

>>> import multiprocess as mp
>>> p = mp.Pool()
>>> 
>>> # a time consuming function taking x,y,z,...
>>> def fun(*args):
...   import time
...   time.sleep(.1)
...   return sum(*args)
... 
>>> dim1, dim2, dim3 = 10, 20, 30
>>> import itertools
>>> input = ((i,j,k) for i,j,k in itertools.combinations_with_replacement(xrange(dim3), 3) if i < dim1 and j < dim2)
>>> results = p.map(fun, input)
>>> p.close()
>>> p.join()
>>>
>>> results[:2]
[0, 1]
>>> results[-2:]
[56, 57]

Note I'm using multiprocess instead of multiprocessing, but that's only to get the ability to work in the interpreter.

I didn't use a numpy.array, but if you had to... you could just dump the output from p.map directly into a numpy.array and then modify the shape attribute to be shape = (dim1, dim2, dim3), or you could do something like this:

>>> input = ((i,j,k) for i,j,k in itertools.combinations_with_replacement(xrange(dim3), 3) if i < dim1 and j < dim2)
>>> import numpy as np
>>> results = np.empty(dim1*dim2*dim3)
>>> res = p.imap(fun, input)
>>> for i,r in enumerate(res):
...   results[i] = r
... 
>>> results.shape = (dim1,dim2,dim3)

Here is a version of code that runs fun(i, j, k) in parallel for differend k indices. This is done by running fun in different processes by using https://docs.python.org/2/library/multiprocessing.html

import numpy as np
from multiprocessing import Pool


def fun(x, y, z):
    # time-consuming computation...
    # ...

    return output


def fun_wrapper(indices):
    fun(*indices)

if __name__ == '__main__':
    dim1 = 10
    dim2 = 20
    dim3 = 30

    result = np.zeros([dim1, dim2, dim3])

    pool = Pool(processes=8)
    for i in xrange(dim1):
        for j in xrange(dim2):
            result[i, j] = pool.map(fun_wrapper, [(i, j, k) for k in xrange(dim3)])

This is not the most elegant solution but you may start with it. And you will get a speed up only if fun contains time-consuming computation

A simple approach could be to divide the array in sections and create some threads to operate throught these sections. For example one section from (0,0,0) to (5,10,15) and other one from (5,10,16) to (10,20,30).

You can use threading module and do something like this

import numpy as np
import threading as t


def fun(x, y, z):
    # time-consuming computation...
    # ...

    return output


dim1 = 10
dim2 = 20
dim3 = 30

result = np.zeros([dim1, dim2, dim3])
#b - beginning index, e - end index
def work(ib,jb,kb,ie,je,ke):
    for i in xrange(ib,ie):
        for j in xrange(jb,je):
            for k in xrange(kb,ke):
                result[i, j, k] = fun(i, j, k)

 threads = list()
 threads.append(t.Thread(target=work, args(0,0,0,dim1/2,dim2/2,dim3/2))
 threads.append(t.Thread(target=work, args(dim1/2,dim2/2,dim3/2 +1,dim1, dim2, dim3))

 for thread in threads:
     thread.start()

You can define these sections through some algorithm and determine the number of threads dynamically. Hope it helps you or at least give you some ideas.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!