efficient loop over numpy array

轮回少年 2021-01-06 02:05

Versions of this question have already been asked but I have not found a satisfactory answer.

Problem: given a large numpy vector, find indices of t

8 Answers
  •  感情败类
    2021-01-06 02:23

    Since the answers have stopped coming and none was fully satisfactory, I am posting my own solution for the record.

    It is my understanding that it's the assignment which makes Python slow in this case, not the nested loops as I thought initially. Using a library or compiled code eliminates the need for assignments and performance improves dramatically.
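To illustrate the "library" route mentioned above, here is a minimal sketch that uses NumPy alone. It is not the approach benchmarked in this answer, and the function name `duplicate_indices` is hypothetical. It assumes exact equality is wanted and, unlike a pairwise search, it reports every repeated element once (all occurrences of a value except the first).

```python
import numpy as np


def duplicate_indices(vect):
    """Return sorted indices of elements that repeat an earlier value.

    Hypothetical helper: exact equality only, no eps tolerance.
    """
    # A stable sort keeps equal values in their original index order,
    # so the first occurrence of each value comes first within its run.
    order = np.argsort(vect, kind="stable")
    sorted_vals = vect[order]

    # An element is a repeat if it equals its predecessor in sorted order.
    is_repeat = np.zeros(len(vect), dtype=bool)
    is_repeat[1:] = sorted_vals[1:] == sorted_vals[:-1]

    # Map back to original positions and return them in ascending order.
    return np.sort(order[is_repeat])


N = 10000
vect = np.arange(N, dtype=np.float32)
vect[N // 2] = 1
vect[N // 4] = 1

result = duplicate_indices(vect)
print(result)
```

The sort makes this O(N log N) instead of O(N^2), but it does not extend directly to the approximate comparison with `eps`.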

    from __future__ import print_function
    import numpy as np
    from numba import jit
    
    N = 10000
    vect = np.arange(N, dtype=np.float32)
    
    # Plant two duplicates of the value 1 (vect[1] is already 1).
    # Integer division keeps the indices valid on Python 3 as well.
    vect[N // 2] = 1
    vect[N // 4] = 1
    # Output buffer for the indices of the duplicates found
    dupl = np.zeros(N, dtype=np.int32)
    
    print("init done")
    # Uncomment the next line to compile the function with numba
    #@jit
    def duplicates(i, counter, dupl, vect):
        eps = 0.01
        ns = len(vect)
        for j in range(i+1, ns):
            # Swap the two conditions to use the approximate comparison
            #if abs(vect[i] - vect[j]) < eps:
            if vect[i] == vect[j]:
                dupl[counter] = j
                counter += 1
        return counter
    
    counter = 0
    # range works on both Python 2 and 3 (xrange is Python 2 only)
    for i in range(N):
        counter = duplicates(i, counter, dupl, vect)
    
    print("counter =", counter)
    print(dupl[0:counter])
    

    Tests

    # no jit
    $ time python array-test-numba.py
    init done
    counter = 3
    [2500 5000 5000]
    
    elapsed 10.135 s
    
    # with jit
    $ time python array-test-numba.py
    init done
    counter = 3
    [2500 5000 5000]
    
    elapsed 0.480 s
    

    The compiled version (with @jit uncommented) comes close to the performance of C code, which is about 0.1 to 0.2 s. Moving the remaining Python-level loop into the compiled function might improve performance further. The gap widens with the approximate comparison using eps: the interpreted version becomes much slower, while the compiled version barely changes.

    # no jit
    $ time python array-test-numba.py
    init done
    counter = 3
    [2500 5000 5000]
    
    elapsed 109.218 s
    
    # with jit
    $ time python array-test-numba.py
    init done
    counter = 3
    [2500 5000 5000]
    
    elapsed 0.506 s
    

    This is a ~200x difference. In the real code, I had to put both loops in the function and use a function template with variable types, so it was slightly more complex, but not by much.
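The real code is not shown, so the following is only a sketch of what putting both loops into one compiled function could look like. The name `all_duplicates`, the `eps` parameter, and the bounds check on the output buffer are assumptions for illustration. If numba is not installed, the fallback decorator lets the same code run uncompiled, so a smaller `N` is used here.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback so the sketch still runs (slowly) without numba installed.
    def njit(func):
        return func


@njit
def all_duplicates(vect, dupl, eps):
    """Store index j of every pair i < j with |vect[i] - vect[j]| < eps.

    Returns the number of pairs found. Indices beyond the capacity of
    dupl are counted but not stored.
    """
    counter = 0
    n = len(vect)
    for i in range(n):
        for j in range(i + 1, n):
            if abs(vect[i] - vect[j]) < eps:
                if counter < len(dupl):
                    dupl[counter] = j
                counter += 1
    return counter


N = 1000  # kept small so the uncompiled fallback finishes quickly
vect = np.arange(N, dtype=np.float32)
vect[N // 2] = 1
vect[N // 4] = 1
dupl = np.zeros(N, dtype=np.int32)

counter = all_duplicates(vect, dupl, 0.01)
print("counter =", counter)
print(dupl[:min(counter, len(dupl))])
```

With both loops inside the function, the interpreter makes a single call into compiled code instead of one call per outer iteration.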
