efficient loop over numpy array

Asked by 轮回少年 on 2021-01-06 02:05

Versions of this question have already been asked, but I have not found a satisfactory answer.

Problem: given a large numpy vector, find the indices of the elements that are duplicates of some other element in the vector.
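
For reference, the setup the answers below work with, reconstructed from the code they share (so the exact values are illustrative):

    import numpy as np

    N = 10000
    vect = np.arange(float(N))  # 0.0, 1.0, ..., 9999.0
    vect[N // 2] = 1            # plant a duplicate of vect[1]
    vect[N // 4] = 1            # and another one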

8 Answers
  • 2021-01-06 02:15

    Python itself is a highly dynamic, slow language. The idea in numpy is to use vectorization and avoid explicit loops. In this case, you can use np.equal.outer (note that this builds an N×N boolean array, so it trades quadratic memory for speed). You can start with

    a = np.equal.outer(vect, vect)
    

    Now, for example, to find the sum (the 10000 diagonal entries plus the 6 off-diagonal matches among indices 1, 2500, and 5000):

     >>> np.sum(a)
     10006
    

    To find the indices of the elements that are equal to some other element, zero out the diagonal and look for any match in each column:

    np.fill_diagonal(a, 0)
    
    >>> np.nonzero(np.any(a, axis=0))[0]
    array([   1, 2500, 5000])
    

    Timing

    def find_vec():
        a = np.equal.outer(vect, vect)
        s = np.sum(a)  # full sum including the diagonal (10006)
        np.fill_diagonal(a, 0)
        return np.sum(a), np.nonzero(np.any(a, axis=0))[0]
    
    >>> %timeit find_vec()
    1 loops, best of 3: 214 ms per loop
    
    def find_loop():
        # the original double loop from the question, for comparison
        dupl = []
        counter = 0
        for i in range(N):
            for j in range(i + 1, N):
                if vect[i] == vect[j]:
                    dupl.append(j)
                    counter += 1
        return dupl
    
    >>> %timeit find_loop()
    1 loops, best of 3: 8.51 s per loop
    
  • 2021-01-06 02:16

    This runs in 8 ms, compared to 18 s for your code, and doesn't use any exotic libraries. It's similar to the approach by @vs0, but I like defaultdict more. It should be approximately O(N), since each element is processed once with constant-time dictionary operations.

    from collections import defaultdict

    dupl = []
    counter = 0
    indexes = defaultdict(list)  # value -> positions seen so far
    for i, e in enumerate(vect):
        indexes[e].append(i)
        if len(indexes[e]) > 1:  # every occurrence after the first is a duplicate
            dupl.append(i)
            counter += 1
    
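    With the setup from the question, this should leave dupl == [2500, 5000]; the indexes map also lets you pull out which values are duplicated (the dict comprehension below is my own illustration, not part of the original answer):

    >>> dupl
    [2500, 5000]
    >>> {v: idx for v, idx in indexes.items() if len(idx) > 1}
    {1.0: [1, 2500, 5000]}
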
  • 2021-01-06 02:21

    The obvious question is why you want to do this in this way. NumPy arrays are intended to be opaque data structures; by this I mean that NumPy arrays are intended to be created inside the NumPy system, with operations then sent into the NumPy subsystem to deliver a result. In other words, NumPy should be a black box into which you throw requests and out of which come results.

    So given the code above, I am not at all surprised that NumPy performance is worse than dreadful.

    The following should be effectively what you want, I believe, but done the NumPy way:

    import numpy as np

    N = 10000
    vect = np.arange(float(N))
    vect[N // 2] = 1   # integer division so this works in Python 3
    vect[N // 4] = 1

    # for each element, ask numpy for the indices of all equal elements
    print([np.where(a == vect)[0] for a in vect][1])

    # Delivers [   1 2500 5000]
    
  • 2021-01-06 02:22

    As an alternative to Ami Tavory's answer, you can use a Counter from the collections package to detect duplicates. On my computer it seems to be even faster. See the functions below; the Counter-based one also reports which values are duplicated.

    import collections
    import numpy as np
    
    def find_duplicates_original(x):
        # the question's double loop, for comparison
        d = []
        for i in range(len(x)):
            for j in range(i + 1, len(x)):
                if x[i] == x[j]:
                    d.append(j)
        return d

    def find_duplicates_outer(x):
        # Ami Tavory's vectorized N x N comparison
        a = np.equal.outer(x, x)
        np.fill_diagonal(a, 0)
        return np.flatnonzero(np.any(a, axis=0))

    def find_duplicates_counter(x):
        counter = collections.Counter(x)
        values = (v for v, c in counter.items() if c > 1)  # values occurring more than once
        return {v: np.flatnonzero(x == v) for v in values}
    
    
    n = 10000
    x = np.arange(float(n))
    x[n // 2] = 1
    x[n // 4] = 1
    
    >>> find_duplicates_counter(x)
    {1.0: array([   1, 2500, 5000], dtype=int64)}

    >>> %timeit find_duplicates_original(x)
    1 loop, best of 3: 12 s per loop

    >>> %timeit find_duplicates_outer(x)
    10 loops, best of 3: 84.3 ms per loop

    >>> %timeit find_duplicates_counter(x)
    1000 loops, best of 3: 1.63 ms per loop
    
  • 2021-01-06 02:23

    Since the answers have stopped coming and none was totally satisfactory, for the record I post my own solution.

    It is my understanding that it's the assignment which makes Python slow in this case, not the nested loops as I initially thought. Using a library or compiled code eliminates the need for those assignments, and performance improves dramatically.

    import numpy as np
    from numba import jit

    N = 10000
    vect = np.arange(N, dtype=np.float32)

    vect[N // 2] = 1
    vect[N // 4] = 1
    dupl = np.zeros(N, dtype=np.int32)  # preallocated output buffer

    print("init done")

    # uncomment to enable the compiled function
    #@jit
    def duplicates(i, counter, dupl, vect):
        eps = 0.01
        ns = len(vect)
        for j in range(i + 1, ns):
            # swap in the commented line to use approximate comparison
            #if abs(vect[i] - vect[j]) < eps:
            if vect[i] == vect[j]:
                dupl[counter] = j
                counter += 1
        return counter

    counter = 0
    for i in range(N):
        counter = duplicates(i, counter, dupl, vect)

    print("counter =", counter)
    print(dupl[0:counter])
    

    Tests

    # no jit
    $ time python array-test-numba.py
    init done
    counter = 3
    [2500 5000 5000]
    
    elapsed 10.135 s
    
    # with jit
    $ time python array-test-numba.py
    init done
    counter = 3
    [2500 5000 5000]
    
    elapsed 0.480 s
    

    The performance of the compiled version (with @jit uncommented) is close to C code performance, ~0.1-0.2 s. Perhaps eliminating the remaining Python-level loop could improve the performance even further. The difference is even starker when using the approximate (eps) comparison: the pure-Python version slows down dramatically, while the compiled version barely changes.

    # no jit
    $ time python array-test-numba.py
    init done
    counter = 3
    [2500 5000 5000]
    
    elapsed 109.218 s
    
    # with jit
    $ time python array-test-numba.py
    init done
    counter = 3
    [2500 5000 5000]
    
    elapsed 0.506 s
    

    This is a ~200x difference. In the real code, I had to put both loops in the function, as well as use a function template with variable types, so it was a bit more complex, but not by much.
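
    A minimal sketch of that both-loops-inside structure, assuming the same setup (the function name and the nopython flag are my own illustration, not from the original post):

    import numpy as np
    from numba import jit

    @jit(nopython=True)
    def find_duplicates_jit(vect, dupl):
        # both loops live inside the compiled function, so there is
        # no Python-level call overhead per outer iteration
        counter = 0
        n = len(vect)
        for i in range(n):
            for j in range(i + 1, n):
                if vect[i] == vect[j]:
                    dupl[counter] = j
                    counter += 1
        return counter

    counter = find_duplicates_jit(vect, dupl)
    print(dupl[:counter])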

  • 2021-01-06 02:29

    This solution using the numpy_indexed package has O(n log n) complexity and is fully vectorized, so in all likelihood it is not terribly far from C performance.

    import numpy as np
    import numpy_indexed as npi

    # multiplicity gives, for each element, how many times its value occurs
    dpl = np.flatnonzero(npi.multiplicity(vect) > 1)
    
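    If you'd rather avoid the extra dependency, a similar O(n log n) result can be had with plain numpy's np.unique; this sketch is my own addition, not part of the original answer:

    import numpy as np

    # counts[inv] maps each element to the multiplicity of its value
    _, inv, counts = np.unique(vect, return_inverse=True, return_counts=True)
    dpl = np.flatnonzero(counts[inv] > 1)
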