efficient loop over numpy array

轮回少年

Versions of this question have already been asked but I have not found a satisfactory answer.

Problem: given a large numpy vector, find indices of t

8 answers

梦如初夏

2021-01-06 02:33
Approach #1

You can reproduce the iterator dependency (the inner index depending on the outer one) in a vectorized solution by using a triangular matrix. This is based on an earlier post that dealt with multiplication involving iterator dependency. To test each element of vect for equality against all of its elements, we can use NumPy broadcasting. Finally, we can use np.count_nonzero to get the count, as it is very efficient at counting True values in boolean arrays.

So, we would have a solution like so -
```
mask = np.triu(vect[:,None] == vect,1)
counter = np.count_nonzero(mask)
dupl = np.where(mask)[1]
```
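As a quick sanity check of the mask-based idea, here is a minimal sketch on a made-up five-element vector. The sample values and the function name `find_duplicate_pairs` are illustrative assumptions, not part of the original post. Keep in mind that the comparison builds an N x N boolean matrix, so memory use grows quadratically with the length of the vector.

```python
import numpy as np

def find_duplicate_pairs(vect):
    """Return (count, later indices) of equal pairs (i, j) with i < j."""
    # Compare every element with every other element via broadcasting,
    # then keep only the strictly upper triangle so each pair counts once.
    mask = np.triu(vect[:, None] == vect, 1)
    counter = np.count_nonzero(mask)
    dupl = np.where(mask)[1]  # index j of the later element in each pair
    return counter, dupl

# Hypothetical sample data: 3.0 and 1.0 each appear twice.
vect = np.array([3.0, 1.0, 3.0, 2.0, 1.0])
counter, dupl = find_duplicate_pairs(vect)
print(counter, dupl)
```

Here the equal pairs are (0, 2) and (1, 4), so `dupl` holds the later index of each pair.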
If you only care about the count, counter, there are two more approaches, listed next.

Approach #2

We can avoid the triangular matrix altogether. Count the matches in the entire equality matrix, subtract the contribution of the diagonal elements (every element equals itself), and halve what remains: the lower and upper triangular regions contribute identical counts, so only one of them should be kept.

So, we would have a modified solution like so -
```
counter = (np.count_nonzero(vect[:,None] == vect) - vect.size)//2
```
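To see why subtracting the diagonal and halving gives the pair count, the following sketch compares that formula against a brute-force double loop on a small made-up vector. The sample data is an assumption for illustration, and the formula assumes the vector contains no NaN values (NaN does not compare equal to itself, so the diagonal would no longer be all True).

```python
import numpy as np

# Hypothetical sample data: 5 appears three times, 7 twice, 9 once.
vect = np.array([5, 7, 5, 5, 7, 9])

# Full equality matrix: symmetric, with an all-True diagonal.
full = np.count_nonzero(vect[:, None] == vect)
counter = (full - vect.size) // 2

# Brute-force reference: count pairs (i, j) with i < j and equal values.
brute = sum(
    1
    for i in range(vect.size)
    for j in range(i + 1, vect.size)
    if vect[i] == vect[j]
)
print(counter, brute)
```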
Approach #3

Here's an entirely different approach. It uses the fact that a value occurring c times contributes 0 + 1 + ... + (c - 1) pairs to the final total, which can be built as a cumulative sum over each group of repeats.

So, with that idea in mind, we would have a third approach like so -
```
count = np.bincount(vect) # non-negative ints only; otherwise use np.unique(vect,return_counts=True)[1]
idx = count[count>1]
id_arr = np.ones(idx.sum(),dtype=int)
id_arr[0] = 0
id_arr[idx[:-1].cumsum()] = -idx[:-1]+1
counter = np.sum(id_arr.cumsum())
```
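The ramp built above sums 0 + 1 + ... + (c - 1) for each value that occurs c times, which equals c(c - 1)/2. Assuming only the count is needed, a shorter sketch can apply that closed form directly to the per-value counts. The function name `count_duplicate_pairs` and the sample data are illustrative assumptions; `np.unique` is used so that float vectors work too.

```python
import numpy as np

def count_duplicate_pairs(vect):
    """Count pairs (i, j), i < j, with vect[i] == vect[j]."""
    # Number of occurrences of each distinct value.
    _, count = np.unique(vect, return_counts=True)
    # A value seen c times forms c * (c - 1) / 2 pairs; singletons give 0.
    return int(np.sum(count * (count - 1) // 2))

# Hypothetical sample data: 1.0 appears three times, 4.0 twice.
vect = np.array([1.0, 4.0, 1.0, 1.0, 4.0, 8.0])
print(count_duplicate_pairs(vect))
```

This version also handles a vector with no duplicates, where the result is simply 0.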

忘了有多久

2021-01-06 02:41

> I wonder why whatever I tried Python is 100x or more slower than an equivalent C code.

Because pure-Python loops are usually around 100x slower than the equivalent C code.

You can either implement the critical code paths in C and provide Python-C bindings, or change the algorithm. You can write an O(N) version by using a dict that inverts the array, mapping each value to the list of indices where it occurs.

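```
import numpy as np

N = 10000
vect = np.arange(float(N))
# Plant two duplicates of the value 1.0 (// because indices must be ints)
vect[N // 2] = 1
vect[N // 4] = 1
dupl = {}  # value -> list of indices where that value occurs
print("init done")
counter = 0
for i in range(N):
    e = dupl.get(vect[i], None)
    if e is None:
        dupl[vect[i]] = [i]  # first time this value is seen
    else:
        e.append(i)  # repeated value: record its index
        counter += 1

print("counter =", counter)
# Show only the values that occur more than once
print([(k, v) for k, v in dupl.items() if len(v) > 1])
```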
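One detail worth spelling out: the loop above increments counter once per repeated occurrence, which is not the same as the number of equal pairs (i, j) with i < j that the comparison-matrix approaches count. The sketch below, using a hypothetical helper `group_indices` and made-up sample data, shows how both numbers can be derived from the same value-to-indices grouping.

```python
from collections import defaultdict

def group_indices(values):
    """Map each value to the list of indices where it occurs (one pass)."""
    groups = defaultdict(list)
    for i, v in enumerate(values):
        groups[v].append(i)
    return groups

# Hypothetical sample data: 1.0 appears three times.
values = [0.0, 1.0, 2.0, 1.0, 1.0]
groups = group_indices(values)

# Extra occurrences: one per repeat, like the counter in the loop.
repeats = sum(len(idx) - 1 for idx in groups.values())
# Equal pairs (i, j) with i < j: c * (c - 1) / 2 for a group of size c.
pairs = sum(len(idx) * (len(idx) - 1) // 2 for idx in groups.values())
print(repeats, pairs)
```

For a value that occurs only twice the two numbers agree, so the difference only shows up once a value repeats three or more times.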

Edit:

If you need to compare with a tolerance eps, that is, abs(vect[i] - vect[j]) < eps, you can normalize the values by eps:

abs(vect[i] - vect[j]) < eps ->
abs(vect[i] - vect[j]) / eps < (eps / eps) ->
abs(vect[i]/eps - vect[j]/eps) < 1
int(abs(vect[i]/eps - vect[j]/eps)) = 0

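Like this (note that using int(vect[i] / eps) as the dict key is an approximation of the test above: two values closer than eps can still fall on either side of a bucket boundary and get different keys):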

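```
import numpy as np

N = 10000
vect = np.arange(float(N))
# Plant two duplicates of the value 1.0 (// because indices must be ints)
vect[N // 2] = 1
vect[N // 4] = 1
dupl = {}  # bucket key -> list of indices whose values fall in that bucket
print("init done")
counter = 0
eps = 0.01
for i in range(N):
    k = int(vect[i] / eps)  # bucket of width eps containing vect[i]
    e = dupl.get(k, None)
    if e is None:
        dupl[k] = [i]  # first value seen in this bucket
    else:
        e.append(i)  # another value in the same bucket
        counter += 1

print("counter =", counter)
# Show only the buckets that hold more than one index
print([(k, v) for k, v in dupl.items() if len(v) > 1])
```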
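If an exact abs(vect[i] - vect[j]) < eps test is required, one possible refinement is to also look in the two neighbouring buckets and confirm each candidate with the real comparison, since values closer than eps can differ by at most one in their bucket key. The following is a minimal sketch under that assumption; the function name `near_duplicate_pairs` and the sample values are hypothetical, and it uses `math.floor` rather than `int` so that negative values are bucketed consistently.

```python
import math

def near_duplicate_pairs(values, eps):
    """Return index pairs (i, j), i < j, with abs(values[i] - values[j]) < eps."""
    buckets = {}  # bucket key -> indices of values already seen in that bucket
    pairs = []
    for j, v in enumerate(values):
        k = math.floor(v / eps)
        # A value within eps of v can only sit in the same or an adjacent bucket.
        for kk in (k - 1, k, k + 1):
            for i in buckets.get(kk, ()):
                if abs(values[i] - v) < eps:  # confirm with the real test
                    pairs.append((i, j))
        buckets.setdefault(k, []).append(j)
    return pairs

# Hypothetical sample data: 0.375 and 0.625 are 0.25 apart but fall into
# different buckets when eps = 0.5.
values = [0.375, 0.625, 2.0]
eps = 0.5
print(near_duplicate_pairs(values, eps))
```

The run time stays close to linear as long as each bucket holds only a few values; with many values crowded into the same buckets it degrades toward the quadratic comparison.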
