DataFrame sorting based on a function of multiple column values

前端未结


Using Python and pandas, sort a DataFrame in descending order:

Given:

from pandas import DataFrame
import pandas as pd

d = {'x':[2,3,1,4,5],
     'y':[


                      
5 Answers

没有蜡笔的小新, 2020-12-06 05:26

You can create a temporary column, sort on it, and then drop it:

df.assign(f = df['one']**2 + df['two']**2).sort_values('f').drop('f', axis=1)
Out: 
  letter  one  two
2      b    1    3
3      b    4    2
1      a    3    4
4      c    5    1
0      a    2    5
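
For readers who want to run this end to end, here is a minimal self-contained sketch of the same idea. The sample data is the frame used in the answers on this page; the column name "_key" is a hypothetical placeholder, and the mergesort option is only there to keep ties in their original order.

```python
import pandas as pd

# Sample data taken from the answers on this page.
df = pd.DataFrame({
    "one": [2, 3, 1, 4, 5],
    "two": [5, 4, 3, 2, 1],
    "letter": ["a", "a", "b", "b", "c"],
})

# Add a temporary key column, sort on it, then drop it again.
# "_key" is a hypothetical name; pick one that cannot clash with real columns.
result = (
    df.assign(_key=df["one"] ** 2 + df["two"] ** 2)
      .sort_values("_key", kind="mergesort")  # mergesort is stable
      .drop(columns="_key")
)

print(result.index.tolist())  # [2, 3, 1, 4, 0]
```

Because assign returns a new frame, the original df is left untouched.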

                                                                        
                                                        
            
            
              
                
再見小時候, 2020-12-06 05:27

Another approach, similar to the other answers here, is to use argsort, which returns the sorting permutation of row positions directly:
f = lambda r: r.x**2 + r.y**2
df.iloc[df.apply(f, axis=1).argsort()]

I think argsort expresses the intent better than a regular sort: we don't care about the computed values, only about the resulting order.
It could also be interesting to patch DataFrame to add this functionality:
def apply_sort(self, *, key):
    return self.iloc[self.apply(key, axis=1).argsort()]

pd.DataFrame.apply_sort = apply_sort

We can then simply write:
>>> df.apply_sort(key=f)

   x  y letter
2  1  3      b
3  4  2      b
1  3  4      a
4  5  1      c
0  2  5      a
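
If patching pd.DataFrame feels too invasive, the same idea can be written as a plain function. The sketch below is one possible variant, not part of the answer above: sort_by is a hypothetical helper name, and it assumes the key function is vectorized (it receives the whole frame rather than one row at a time), which avoids the row-wise apply.

```python
import numpy as np
import pandas as pd

def sort_by(df, key):
    """Return df reordered by key(df), a vectorized function of its columns."""
    values = np.asarray(key(df))
    # argsort yields positions, so .iloc is the matching accessor.
    order = np.argsort(values, kind="stable")
    return df.iloc[order]

df = pd.DataFrame({
    "x": [2, 3, 1, 4, 5],
    "y": [5, 4, 3, 2, 1],
    "letter": ["a", "a", "b", "b", "c"],
})

result = sort_by(df, lambda d: d.x ** 2 + d.y ** 2)
print(result.index.tolist())  # [2, 3, 1, 4, 0]
```

Since argsort works on positions, this also behaves the same when the index is not the default 0..n-1 range.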

                                                                        
                                                        
            
            
              
                
梦如初夏, 2020-12-06 05:29

Have you tried creating a new column and then sorting on it? I cannot comment on the original post, so I am posting my solution here.

df['c'] = df.a**2 + df.b**2
df = df.sort_values('c')
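
Since the question mentions descending order, it may help to see the same steps with ascending=False and with the helper column removed afterwards. This is only a sketch: the values in columns a and b are assumed (borrowed from the sample data in the other answers), because this answer does not define them.

```python
import pandas as pd

# Assumed sample values; the answer above does not show its data.
df = pd.DataFrame({"a": [2, 3, 1, 4, 5], "b": [5, 4, 3, 2, 1]})

df["c"] = df.a ** 2 + df.b ** 2            # helper column with the sort key
df = df.sort_values("c", ascending=False)  # largest key first
df = df.drop(columns="c")                  # remove the helper column again

print(df.index.tolist())  # [0, 4, 1, 3, 2]
```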

                                                                        
                                                        
            
            
              
                
没有蜡笔的小新, 2020-12-06 05:44

df.loc[(df.x ** 2 + df.y ** 2).sort_values().index]

(.loc rather than .iloc, because sort_values().index holds index labels, not positions.)

Adapted from "How to sort pandas dataframe by custom order on string index".
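
The index returned by sort_values() contains labels, not positions, so the choice of accessor matters as soon as the index is not the default 0..n-1 range. A minimal sketch, assuming hypothetical string labels on the index (they are not part of the original question):

```python
import pandas as pd

df = pd.DataFrame(
    {"x": [2, 3, 1, 4, 5], "y": [5, 4, 3, 2, 1]},
    index=list("abcde"),  # hypothetical string labels
)

key = df.x ** 2 + df.y ** 2

# sort_values().index returns labels, so .loc is the matching accessor.
by_label = df.loc[key.sort_values().index]

# The positional equivalent goes through argsort and .iloc.
by_position = df.iloc[key.to_numpy().argsort(kind="stable")]

print(by_label.index.tolist())  # ['c', 'd', 'b', 'e', 'a']
```

Both routes give the same row order here; the label-based one reads more directly, while the positional one does not depend on the index at all.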
                                                                        
                                                        
            
            
              
                
無奈伤痛, 2020-12-06 05:49

from pandas import DataFrame
import pandas as pd

d = {'one':[2,3,1,4,5],
     'two':[5,4,3,2,1],
     'letter':['a','a','b','b','c']}

df = pd.DataFrame(d)

#f = lambda x,y: x**2 + y**2
# Compute one**2 + two**2 for each row.
# .ix is gone from current pandas, so select the columns by name with .loc.
array = []
for i in range(len(df)):
    array.append(df.loc[i,'one']**2 + df.loc[i,'two']**2)
array = pd.DataFrame(array, columns = ['Sum of Squares'])
test = pd.concat([df,array],axis = 1, join = 'inner')
# sort_index(by=...) is no longer available; sort_values replaces it.
test = test.sort_values('Sum of Squares', ascending = True).drop('Sum of Squares',axis =1)


Just realized that you wanted this: 

    letter  one  two
2      b    1    3
3      b    4    2
1      a    3    4
4      c    5    1
0      a    2    5

                                                                        
                                                        
            
            
              
                