Get indices of top N values in 2D numpy ndarray or numpy matrix

南旧 2020-12-21 03:59

I have an array of N-dimensional vectors.

data = np.array([[5, 6, 1], [2, 0, 8], [4, 9, 3]])

In [1]: data
Out[1]:
array([[5, 6, 1],
       [2, 0, 8],
       [4, 9, 3]])
2 Answers
  • 2020-12-21 04:35

    As a slight improvement over the otherwise very good answer by DSM, instead of using np.argsort(), it is more efficient to use np.argpartition() if the order of the N greatest is of no consequence.

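    Partitioning an array arr at index i rearranges the elements so that the element at index i is the one that would be there if arr were sorted in ascending order: everything to its left is less than or equal to it, and everything to its right is greater than or equal to it. Neither side is necessarily sorted. The advantage is that partitioning runs in linear time, whereas a full sort takes O(n log n). Partitioning at index -N therefore leaves the N greatest values in the last N positions.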
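
A minimal sketch of the partitioning idea on a 1D array. The values in `arr` and the choice `n = 3` are illustrative assumptions, not taken from the question.

```python
import numpy as np

# Illustrative values (an assumption for this sketch).
arr = np.array([5, 6, 1, 2, 0, 8, 4, 9, 3])
n = 3

# Indices of the n largest values; their relative order is not guaranteed.
top_unordered = np.argpartition(arr, -n)[-n:]

# If the order does matter, sort only those n candidates (largest first).
top_sorted = top_unordered[np.argsort(arr[top_unordered])[::-1]]

print(sorted(top_unordered.tolist()))
print(top_sorted.tolist())
```

Sorting only the n candidates keeps the total cost close to linear when n is small compared with the array length.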

  • 2020-12-21 04:47

    I'd ravel, argsort, and then unravel. I'm not claiming this is the best way, only that it's the first way that occurred to me, and I'll probably delete it in shame after someone posts something more obvious. :-)

    That said (choosing the top 2 values, arbitrarily):

    In [73]: dists = sklearn.metrics.pairwise_distances(data)
    
    In [74]: dists[np.tril_indices_from(dists, -1)] = 0
    
    In [75]: dists
    Out[75]: 
    array([[  0.        ,   9.69535971,   3.74165739],
           [  0.        ,   0.        ,  10.48808848],
           [  0.        ,   0.        ,   0.        ]])
    
    In [76]: ii = np.unravel_index(np.argsort(dists.ravel())[-2:], dists.shape)
    
    In [77]: ii
    Out[77]: (array([0, 1]), array([1, 2]))
    
    In [78]: dists[ii]
    Out[78]: array([  9.69535971,  10.48808848])
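
For readers who want something to paste into a script, here is a hedged sketch of the same ravel / select / unravel pattern wrapped in a function. The helper name `top_n_indices_2d` is hypothetical, and it uses `np.argpartition` in place of a full `np.argsort`. It runs on the `data` array from the question rather than on `dists`, so it does not need scikit-learn.

```python
import numpy as np

def top_n_indices_2d(a, n):
    """Return (rows, cols) of the n largest entries of a 2D array, largest first.

    Hypothetical helper for illustration; assumes 1 <= n <= a.size.
    """
    flat = a.ravel()
    # The last n slots hold the flat indices of the n largest values (unordered).
    candidates = np.argpartition(flat, -n)[-n:]
    # Sort only those n candidates, descending by value.
    candidates = candidates[np.argsort(flat[candidates])[::-1]]
    # Convert flat indices back to (row, column) index arrays.
    return np.unravel_index(candidates, a.shape)

data = np.array([[5, 6, 1], [2, 0, 8], [4, 9, 3]])
rows, cols = top_n_indices_2d(data, 2)
print(rows, cols)
print(data[rows, cols])
```

The returned tuple can be used directly for fancy indexing, as with `ii` in the session above.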
    