How to sort a numpy array with key as isnan?

I have a numpy array like

np.array([[1.0, np.nan, 5.0, 1, True, True, np.nan, True],
       [np.nan, 4.0, 7.0, 2, True, np.nan, False, True],
       [2.0, 5


                      
3 answers

醉话见心 (2020-12-21 17:03)

You can't do this with an object array that contains NaN. You would need to find a numeric type that everything fits into. In an object array, NaN is compared as a plain Python object rather than as a float, and it returns False for <, >, and ==.

Additionally, True and False are equivalent to 1 and 0, so I don't think there is any way to get your expected result.

You would have to see if converting the dtype to float would give you proper results for your use case.
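
To make the points above concrete, here is a minimal sketch. It assumes the first row of the question's array and simply checks how NaN and booleans compare, then shows what a float conversion followed by `np.sort` does. Note that this sorts by value as well, so it may not match the ordering the question asks for.

```python
import numpy as np

# First row of the array from the question, stored with dtype=object
row = np.array([1.0, np.nan, 5.0, 1, True, True, np.nan, True], dtype=object)

# NaN compares False against everything, including itself
nan = float("nan")
nan_checks = (nan < 1.0, nan > 1.0, nan == nan)
print(nan_checks)

# bool is a subclass of int, so True == 1 and False == 0
bool_checks = (True == 1, False == 0)
print(bool_checks)

# After converting to float, np.sort places NaN at the end of the row,
# but the booleans become 1.0 and the values are reordered by magnitude
sorted_row = np.sort(row.astype(float))
print(sorted_row)
```

Whether losing the bool/int distinction is acceptable depends on the use case.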
                                                                        
                                                        
            
            
              
                
既然无缘 (2020-12-21 17:12)

Approach #1

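Here's a vectorized approach that borrows the masking concept from this post. Sorting the boolean NaN mask along each row pushes the True entries to the end, and boolean-index assignment then fills both groups in row-major order, so the non-NaN values keep their original relative order: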

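import numpy as np

def mask_app(a):
    out = np.empty_like(a)
    # True where the element is NaN (object arrays need a float cast first)
    mask = np.isnan(a.astype(float))
    # Per row, move the True (NaN) slots to the end
    mask_sorted = np.sort(mask, axis=1)
    out[mask_sorted] = a[mask]      # NaNs go into the trailing slots
    out[~mask_sorted] = a[~mask]    # the rest keep their original order
    return out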


Sample run -

# Input dataframe
In [114]: data
Out[114]: 
   ID_1  ID_2  ID_3  Key    Var  Var_1  Var_2 Var_3
0   1.0   NaN   5.0    1   True   True    NaN  True
1   NaN   4.0   7.0    2   True    NaN  False  True
2   2.0   5.0   NaN    3  False  False   True   NaN

# Use pandas approach for verification    
In [115]: data.apply(lambda x : sorted(x,key=pd.isnull),1).values
Out[115]: 
array([[1.0, 5.0, 1, True, True, True, nan, nan],
       [4.0, 7.0, 2, True, False, True, nan, nan],
       [2.0, 5.0, 3, False, False, True, nan, nan]], dtype=object)

# Use proposed approach and verify
In [116]: mask_app(data.values)
Out[116]: 
array([[1.0, 5.0, 1, True, True, True, nan, nan],
       [4.0, 7.0, 2, True, False, True, nan, nan],
       [2.0, 5.0, 3, False, False, True, nan, nan]], dtype=object)
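
To see why the masking trick works, it can help to look at the intermediate arrays on a tiny input. The sketch below uses a made-up 2x3 array (not the question's data) and repeats the steps of `mask_app` inline.

```python
import numpy as np

# Small illustrative input (hypothetical, not from the question)
a = np.array([[1.0, np.nan, 5.0],
              [np.nan, 4.0, np.nan]], dtype=object)

# True marks the NaN positions in the original array
mask = np.isnan(a.astype(float))

# Sorting booleans puts False before True, so the NaN slots move to the end of each row
mask_sorted = np.sort(mask, axis=1)

out = np.empty_like(a)
# Boolean indexing visits elements in row-major order, and each row has the
# same number of True entries in both masks, so values stay in their own row
out[mask_sorted] = a[mask]
out[~mask_sorted] = a[~mask]

print(mask)
print(mask_sorted)
print(out)
```

Because the per-row counts of NaN and non-NaN entries are unchanged by the sort, the flat sequences on both sides of each assignment line up row by row.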


Approach #2

With a few more modifications, here is a simplified version based on the idea from this post:

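def mask_app2(a):
    # Start from an all-NaN array, then overwrite the leading slots
    out = np.full(a.shape, np.nan, dtype=a.dtype)
    mask = ~np.isnan(a.astype(float))   # True where the element is valid
    # Sorting and then reversing each row puts the True slots first
    out[np.sort(mask, axis=1)[:, ::-1]] = a[mask]
    return out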
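
As a quick check of Approach #2, the following sketch runs `mask_app2` on an object array built by hand from the sample table shown in the run above. Constructing the array directly instead of going through a DataFrame is an assumption made here to keep the example self-contained.

```python
import numpy as np

def mask_app2(a):
    # Start from an all-NaN array, then overwrite the leading slots
    out = np.full(a.shape, np.nan, dtype=a.dtype)
    mask = ~np.isnan(a.astype(float))
    out[np.sort(mask, axis=1)[:, ::-1]] = a[mask]
    return out

# Same values as the sample DataFrame, typed in by hand
a = np.array([[1.0, np.nan, 5.0, 1, True, True, np.nan, True],
              [np.nan, 4.0, 7.0, 2, True, np.nan, False, True],
              [2.0, 5.0, np.nan, 3, False, False, True, np.nan]], dtype=object)

result = mask_app2(a)
print(result)
```

Each row should come back with its non-NaN values in their original order, followed by the NaNs.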

                                                                        
                                                        
            
            
              
                
再見小時候 (2020-12-21 17:12)

Since you have an object array anyway, do the sorting in Python, then make your array. You can write a key that does something like this:

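from math import isnan

def key(x):
    if isnan(x):
        t = 3   # NaNs sort last
        x = 0   # and compare equal to one another
    elif isinstance(x, bool):
        t = 2   # booleans come after plain numbers
    else:
        t = 1   # plain numbers come first
    return t, x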


This key returns a two-element tuple, where the first element gives the preliminary ordering by type. It considers all NaNs to be equal and greater than any other type.

Even if you start with data in a DataFrame, you can do something like:

values = [sorted(row, key=key) for row in data.values]
values = np.array(values, dtype=object)  # np.object was removed in newer NumPy releases


You can replace the list comprehension with np.apply_along_axis if that suits your needs better:

# dtype=object keeps the bools and ints from being promoted to float
values = np.apply_along_axis(lambda row: np.array(sorted(row, key=key), dtype=object),
                             axis=1, arr=data.values)
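
Here is a self-contained sketch of the key-based approach. It assumes the data is already an object array (typed in by hand from the sample table) rather than a DataFrame. Note that this key also orders values within each type group, so the result can differ from a pure "NaNs last, everything else untouched" ordering.

```python
from math import isnan

import numpy as np

def key(x):
    if isnan(x):
        return 3, 0        # all NaNs are equal and sort last
    if isinstance(x, bool):
        return 2, x        # booleans after plain numbers
    return 1, x            # plain numbers first, ordered by value

# Hand-built stand-in for data.values
arr = np.array([[1.0, np.nan, 5.0, 1, True, True, np.nan, True],
                [np.nan, 4.0, 7.0, 2, True, np.nan, False, True]], dtype=object)

# sorted() is stable, so elements with equal keys keep their original order
values = np.array([sorted(row, key=key) for row in arr], dtype=object)
print(values)
```

In the first row the numbers come out as 1.0, 1, 5.0 (sorted by value), followed by the three booleans and then the two NaNs.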

                                                                        
                                                        
            
            
              
                