Numpy with Combinatoric generators: How does one speed up Combinations?

五迷三道 submitted on 2019-12-12 21:54:34

Question


It is my understanding that the itertools functions are written in C. Suppose I wanted to speed up this example code:

import numpy as np
from itertools import combinations_with_replacement

def combinatorics(LargeArray):
    n = LargeArray.shape[0]
    # np.empty allocates without initializing; only entries with x <= y are filled below
    newArray = np.empty((n, n))
    # iterate over every index pair (x, y) with x <= y
    for x, y in combinations_with_replacement(range(n), r=2):
        z = LargeArray[x] + LargeArray[y]
        newArray[x, y] = z
    return newArray

Since combinations_with_replacement is written in C, does that imply that it can't be sped up? Please advise.

Thanks in advance.


Answer 1:


It's true that combinations_with_replacement is written in C, which means that you're not likely to speed up that part of the code. But most of your running time isn't spent on finding the combinations: it's spent in the Python-level for loop that does the additions one element at a time. You really, really want to avoid that kind of loop if at all possible when you're using numpy. This version will do almost the same thing, using broadcasting:

def sums(large_array):
    return large_array.reshape((-1, 1)) + large_array.reshape((1, -1))
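
To make the broadcasting step concrete, here is a small sketch of what the two reshapes produce. The variable names column, row, and pairwise are illustrative and not part of the original answer; np.add.outer is mentioned only as an equivalent way to get the same table of sums.

```python
import numpy as np

large_array = np.arange(4).astype(float)

column = large_array.reshape((-1, 1))  # shape (4, 1): one value per row
row = large_array.reshape((1, -1))     # shape (1, 4): one value per column

# Broadcasting stretches both operands to shape (4, 4), so that
# pairwise[i, j] == large_array[i] + large_array[j].
pairwise = column + row

# np.add.outer builds the same table of pairwise sums.
same = np.array_equal(pairwise, np.add.outer(large_array, large_array))
print(column.shape, row.shape, pairwise.shape, same)
```

The whole table is computed inside numpy's compiled code, with no Python-level loop over index pairs.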

For example:

>>> ary = np.arange(5).astype(float)
>>> np.triu(combinatorics(ary))
array([[ 0.,  1.,  2.,  3.,  4.],
       [ 0.,  2.,  3.,  4.,  5.],
       [ 0.,  0.,  4.,  5.,  6.],
       [ 0.,  0.,  0.,  6.,  7.],
       [ 0.,  0.,  0.,  0.,  8.]])
>>> np.triu(sums(ary))
array([[ 0.,  1.,  2.,  3.,  4.],
       [ 0.,  2.,  3.,  4.,  5.],
       [ 0.,  0.,  4.,  5.,  6.],
       [ 0.,  0.,  0.,  6.,  7.],
       [ 0.,  0.,  0.,  0.,  8.]])

The difference is that combinatorics leaves the lower triangle as uninitialized values (np.empty does not zero the memory it allocates), whereas sums makes the matrix symmetric. If you really wanted to avoid adding everything twice, you probably could, but I can't think of how to do it off the top of my head.
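
One possible way to fill only the upper triangle without a Python loop, offered as a sketch rather than as part of the original answer, is to build the index pairs with np.triu_indices. The function name upper_triangle_sums is made up for illustration. Whether this is actually faster than plain broadcasting is not established here, since the index arrays themselves cost time and memory.

```python
import numpy as np

def upper_triangle_sums(large_array):
    n = large_array.shape[0]
    # Index arrays for every (row, col) pair with row <= col.
    rows, cols = np.triu_indices(n)
    # Start from zeros so the lower triangle has defined values.
    out = np.zeros((n, n), dtype=large_array.dtype)
    # Each pair is added exactly once.
    out[rows, cols] = large_array[rows] + large_array[cols]
    return out

ary = np.arange(5).astype(float)
result = upper_triangle_sums(ary)
print(result)
```

For the 5-element example, this should match np.triu(sums(ary)).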

Oh, and the other difference:

>>> big_ary = np.random.random(1000)
>>> %timeit combinatorics(big_ary)
1 loops, best of 3: 482 ms per loop
>>> %timeit sums(big_ary)
1000 loops, best of 3: 1.7 ms per loop
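
The %timeit lines above are IPython magics. Outside IPython, a comparable measurement can be sketched with the standard timeit module, as below. The array size of 200 is an assumption chosen to keep the run short (the original used 1000), and the absolute numbers will differ by machine and by Python and numpy version.

```python
import timeit
from itertools import combinations_with_replacement

import numpy as np

def combinatorics(large_array):
    n = large_array.shape[0]
    new_array = np.empty((n, n))
    for x, y in combinations_with_replacement(range(n), r=2):
        new_array[x, y] = large_array[x] + large_array[y]
    return new_array

def sums(large_array):
    return large_array.reshape((-1, 1)) + large_array.reshape((1, -1))

big_ary = np.random.random(200)  # smaller than the 1000 used above

# Best of three single runs for each version, in seconds.
loop_seconds = min(timeit.repeat(lambda: combinatorics(big_ary), number=1, repeat=3))
broadcast_seconds = min(timeit.repeat(lambda: sums(big_ary), number=1, repeat=3))
print(f"loop: {loop_seconds:.6f} s, broadcasting: {broadcast_seconds:.6f} s")
```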


Source: https://stackoverflow.com/questions/14472362/numpy-with-combinatoric-generators-how-does-one-speed-up-combinations
