efficient loop over numpy array

前端未结

关注

 8  1645

轮回少年 2021-01-06 02:05

Versions of this question have already been asked but I have not found a satisfactory answer.

Problem: given a large numpy vector, find indices of t

8条回答

青春惊慌失措 (楼主)

2021-01-06 02:15

Python itself is a highly-dynamic, slow, language. The idea in numpy is to use vectorization, and avoid explicit loops. In this case, you can use np.equal.outer. You can start with

a = np.equal.outer(vect, vect)

Now, for example, to find the sum:

 >>> np.sum(a)
 10006

To find the indices of i that are equal, you can do

np.fill_diagonal(a, 0)

>>> np.nonzero(np.any(a, axis=0))[0]
array([   1, 2500, 5000])

Timing

def find_vec():
    a = np.equal.outer(vect, vect)
    s = np.sum(a)
    np.fill_diagonal(a, 0)
    return np.sum(a), np.nonzero(np.any(a, axis=0))[0]

>>> %timeit find_vec()
1 loops, best of 3: 214 ms per loop

def find_loop():
    dupl = []
    counter = 0
    for i in range(N):
        for j in range(i+1, N):
             if vect[i] == vect[j]:
                 dupl.append(j)
                 counter += 1
    return dupl

>>> % timeit find_loop()
1 loops, best of 3: 8.51 s per loop

0 讨论(0)

查看其它8个回答