Get N maximum values and indices along an axis in a NumPy array

谎友^ 2020-12-11 10:44

I think this is an easy question for experienced numpy users.

I have a score matrix. The row index corresponds to samples and the column index corresponds to items. For

3 Answers
  • 2020-12-11 11:16

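    Here's an approach using np.argpartition -

    n = a.shape[1]
    # kth = n-M .. n-1 places the M largest entries of each row, in sorted order, at the end
    idx = np.argpartition(a,range(n-M,n))[:,:-M-1:-1] # topM_ind
    out = a[np.arange(a.shape[0])[:,None],idx]        # topM_score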
    

    Sample run -

    In [343]: a
    Out[343]: 
    array([[ 1. ,  0.3,  0.4],
           [ 0.2,  0.6,  0.8],
           [ 0.1,  0.3,  0.5]])
    
    In [344]: M = 2
    
    In [345]: idx = np.argpartition(a,range(a.shape[1]-M,a.shape[1]))[:,:-M-1:-1]
    
    In [346]: idx
    Out[346]: 
    array([[0, 2],
           [2, 1],
           [2, 1]])
    
    In [347]: a[np.arange(a.shape[0])[:,None],idx]
    Out[347]: 
    array([[ 1. ,  0.4],
           [ 0.8,  0.6],
           [ 0.5,  0.3]])
    

    Alternatively, np.argsort gives shorter, though possibly slower, code to get idx -

    idx = a.argsort(1)[:,:-M-1:-1]
    

    Here's a post containing some runtime tests that compare np.argsort and np.argpartition on a similar problem.
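
    The argpartition idea above can be sketched as a small reusable function. This is a minimal sketch, not the answerer's code: the name top_m_per_row is hypothetical, it assumes a 2-D array with 1 <= M <= a.shape[1], and it uses np.take_along_axis (NumPy 1.15+) in place of the fancy-indexing line. It partitions on a single kth, so only the M candidates per row are fully sorted.

```python
import numpy as np

def top_m_per_row(a, M):
    """Return (indices, values) of the M largest entries in each row, largest first."""
    n = a.shape[1]
    # Partition so that the last M positions of each row hold the M largest
    # entries; their order inside that block is not guaranteed.
    part = np.argpartition(a, n - M, axis=1)[:, n - M:]
    part_vals = np.take_along_axis(a, part, axis=1)
    # Sort only the M candidates per row, in descending order.
    order = np.argsort(-part_vals, axis=1)
    idx = np.take_along_axis(part, order, axis=1)
    return idx, np.take_along_axis(a, idx, axis=1)

a = np.array([[1.0, 0.3, 0.4],
              [0.2, 0.6, 0.8],
              [0.1, 0.3, 0.5]])
idx, out = top_m_per_row(a, 2)
```

    For the sample array with M = 2, idx and out match the arrays shown in the sample run.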

  • 2020-12-11 11:19

    In case someone is interested in both the values and their corresponding indices without tampering with the order, the following simple approach may be helpful. It can be computationally expensive on large data, though, since every iteration rescans the array for its maximum and the results are stored in a Python list of (value, index) tuples.

    import numpy as np
    values = np.array([0.01, 0.6, 0.4, 0.0, 0.1, 0.7, 0.12]) # a simple array
    work = values.copy() # scratch copy; values itself is left untouched
    values_indices = [] # list to store (value, index) tuples
    for _ in range(values.size):
        i = int(work.argmax()) # index of the current maximum in the original array
        values_indices.append((float(values[i]), i))
        # mask the maximum instead of deleting it, so later indices still refer to values:
        work[i] = -np.inf
    

    The final output as a list of tuples:

    values_indices
    [(0.7, 5), (0.6, 1), (0.4, 2), (0.12, 6), (0.1, 4), (0.01, 0), (0.0, 3)]
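
    A loop-free variant for a 1-D array can be sketched with np.argsort. This is only an illustration under assumptions: N (how many maxima to keep) is a hypothetical parameter that does not appear in the answer, and the indices refer to positions in the original, unmodified array.

```python
import numpy as np

values = np.array([0.01, 0.6, 0.4, 0.0, 0.1, 0.7, 0.12])
N = 3  # hypothetical: number of maxima to keep

# Indices that sort the array in descending order, truncated to the first N.
top_idx = np.argsort(values)[::-1][:N]
# Pair each value with its index in the original array.
pairs = [(float(values[i]), int(i)) for i in top_idx]
```

    Sorting the whole array costs O(n log n) once, instead of rescanning it for every maximum.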
    
  • 2020-12-11 11:28

    I'd use argsort():

    top2_ind = score_matrix.argsort()[:,::-1][:,:2]
    

    That is, produce an array containing the indices that would sort each row of score_matrix:

    array([[1, 2, 0],
           [0, 1, 2],
           [0, 1, 2]])
    

    Then reverse the columns with ::-1, then take the first two columns with :2:

    array([[0, 2],
           [2, 1],
           [2, 1]])
    

    Then do the same with a regular np.sort() to get the values:

    top2_score = np.sort(score_matrix)[:,::-1][:,:2]
    

    Which, following the same mechanics as above, gives you:

    array([[ 1. ,  0.4],
           [ 0.8,  0.6],
           [ 0.5,  0.3]])
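
    The second sort can be avoided by reusing the indices to gather the values. The sketch below assumes score_matrix holds the sample values from the first answer (this answer does not show the matrix itself) and uses np.take_along_axis, which requires NumPy 1.15 or later.

```python
import numpy as np

# Assumed sample data, borrowed from the first answer's sample run.
score_matrix = np.array([[1.0, 0.3, 0.4],
                         [0.2, 0.6, 0.8],
                         [0.1, 0.3, 0.5]])

# Sort each row ascending, reverse the columns, keep the first two indices.
top2_ind = np.argsort(score_matrix, axis=1)[:, ::-1][:, :2]
# Gather the matching scores row by row instead of sorting a second time.
top2_score = np.take_along_axis(score_matrix, top2_ind, axis=1)
```

    Gathering by index also guarantees that top2_score[i, j] is exactly score_matrix[i, top2_ind[i, j]], which matters when a row contains ties.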
    