Get indices of top N values in 2D numpy ndarray or numpy matrix

后端未结

关注

 2  822

I have an array of N-dimensional vectors.

data = np.array([[5, 6, 1], [2, 0, 8], [4, 9, 3]])

In [1]: data
Out[1]:
array([[5, 6, 1],


                      
              相关标签:


      
      
        
          2条回答        

        
                         				            
            
           
            
                              
                
              
              
                
                  暗喜        
                
              
                            
                2020-12-21 04:35
              
            
            
                                                                       
As a slight improvement over the otherwise very good answer by DSM, instead of using np.argsort(), it is more efficient to use np.argpartition() if the order of the N greatest is of no consequence. 

Partitioning an array arr with index i rearranges the elements such that the element at index i is  the ith greatest, while those on the left are greater and on the right are lesser. The partitions on the left and right are not necessarily sorted. This has the advantage that it runs in linear time. 
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  故里飘歌        
                
              
                            
                2020-12-21 04:47
              
            
            
                                                                       
I'd ravel, argsort, and then unravel.  I'm not claiming this is the best way, only that it's the first way that occurred to me, and I'll probably delete it in shame after someone posts something more obvious. :-)

That said (choosing the top 2 values, arbitrarily):

In [73]: dists = sklearn.metrics.pairwise_distances(data)

In [74]: dists[np.tril_indices_from(dists, -1)] = 0

In [75]: dists
Out[75]: 
array([[  0.        ,   9.69535971,   3.74165739],
       [  0.        ,   0.        ,  10.48808848],
       [  0.        ,   0.        ,   0.        ]])

In [76]: ii = np.unravel_index(np.argsort(dists.ravel())[-2:], dists.shape)

In [77]: ii
Out[77]: (array([0, 1]), array([1, 2]))

In [78]: dists[ii]
Out[78]: array([  9.69535971,  10.48808848])

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
                             
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复