Fast numpy fancy indexing

前端 未结 4 1465
别跟我提以往
别跟我提以往 2020-12-14 04:40

My code for slicing a numpy array (via fancy indexing) is very slow. It is currently a bottleneck in program.

a.shape
(3218, 6)

ts = time.time(); a[rows][:,         


        
4条回答
  •  鱼传尺愫
    2020-12-14 05:19

    To my surprise this, kind of lenghty expression, which calculates first linear 1D-indices, is more than 50% faster than the consecutive array indexing presented in the question:

    (a.ravel()[(
       cols + (rows * a.shape[1]).reshape((-1,1))
       ).ravel()]).reshape(rows.size, cols.size)
    

    UPDATE: OP updated the description of the shape of the initial array. With the updated size the speedup is now above 99%:

    In [93]: a = np.random.randn(3218, 1415)
    
    In [94]: rows = np.random.randint(a.shape[0], size=2000)
    
    In [95]: cols = np.random.randint(a.shape[1], size=6)
    
    In [96]: timeit a[rows][:, cols]
    10 loops, best of 3: 186 ms per loop
    
    In [97]: timeit (a.ravel()[(cols + (rows * a.shape[1]).reshape((-1,1))).ravel()]).reshape(rows.size, cols.size)
    1000 loops, best of 3: 1.56 ms per loop
    

    INITAL ANSWER: Here is the transcript:

    In [79]: a = np.random.randn(3218, 6)
    In [80]: a.shape
    Out[80]: (3218, 6)
    
    In [81]: rows = np.random.randint(a.shape[0], size=2000)
    In [82]: cols = np.array([1,3,4,5])
    

    Time method 1:

    In [83]: timeit a[rows][:, cols]
    1000 loops, best of 3: 1.26 ms per loop
    

    Time method 2:

    In [84]: timeit (a.ravel()[(cols + (rows * a.shape[1]).reshape((-1,1))).ravel()]).reshape(rows.size, cols.size)
    1000 loops, best of 3: 568 us per loop
    

    Check that results are actually the same:

    In [85]: result1 = a[rows][:, cols]
    In [86]: result2 = (a.ravel()[(cols + (rows * a.shape[1]).reshape((-1,1))).ravel()]).reshape(rows.size, cols.size)
    
    In [87]: np.sum(result1 - result2)
    Out[87]: 0.0
    

提交回复
热议问题