I'd like to write a function that normalizes the rows of a large sparse matrix (so that they sum to one).
from pylab import *
import scipy.sparse as sp
Without importing sklearn, converting to dense, or multiplying matrices, you can exploit the data representation of CSR matrices directly:
from scipy.sparse import isspmatrix_csr

def normalize(W):
    """Row-normalize a scipy sparse CSR matrix in place."""
    if not isspmatrix_csr(W):
        raise ValueError('W must be in CSR format.')
    for i in range(W.shape[0]):
        # W.data[W.indptr[i]:W.indptr[i+1]] holds the nonzero values of row i
        row_sum = W.data[W.indptr[i]:W.indptr[i+1]].sum()
        if row_sum != 0:
            W.data[W.indptr[i]:W.indptr[i+1]] /= row_sum
Remember that W.indices is the array of column indices, W.data is the array of corresponding nonzero values, and W.indptr points to the row starts within indices and data.
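As a quick illustration of that layout (a toy matrix, just to show what indptr, indices, and data contain):

```python
import numpy as np
from scipy.sparse import csr_matrix

# Toy 3x3 matrix; note that row 1 is entirely zero.
W = csr_matrix(np.array([[1.0, 2.0, 0.0],
                         [0.0, 0.0, 0.0],
                         [3.0, 0.0, 1.0]]))

print(W.indptr)   # [0 2 2 4]: row 0 spans data[0:2], row 1 is empty, row 2 spans data[2:4]
print(W.indices)  # [0 1 0 2]: column index of each stored value
print(W.data)     # [1. 2. 3. 1.]: the stored nonzero values
```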
You can apply numpy.abs() when taking the sum if you need the L1 norm, or use numpy.max() to normalize by the maximum value per row.
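As a self-contained sanity check of the approach above (the function is repeated here so the snippet runs on its own; the input uses a float dtype, which the in-place division requires):

```python
import numpy as np
from scipy.sparse import csr_matrix, isspmatrix_csr

def normalize(W):
    """Row-normalize a scipy sparse CSR matrix in place."""
    if not isspmatrix_csr(W):
        raise ValueError('W must be in CSR format.')
    for i in range(W.shape[0]):
        row_sum = W.data[W.indptr[i]:W.indptr[i+1]].sum()
        if row_sum != 0:
            W.data[W.indptr[i]:W.indptr[i+1]] /= row_sum

# Float dtype matters: in-place '/=' would fail on an integer data array.
W = csr_matrix(np.array([[1.0, 3.0, 0.0],
                         [0.0, 0.0, 0.0],
                         [2.0, 0.0, 2.0]]))
normalize(W)
print(W.toarray())
# Nonzero rows now sum to 1; the all-zero row is left untouched.
```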