DataFrame sorting based on a function of multiple column values

我在风中等你 2020-12-06 05:11

Using Python and pandas, sort a DataFrame in descending order:

Given:

from pandas import DataFrame
import pandas as pd

d = {'x':[2,3,1,4,5],
     'y':[


        
5 Answers
  • 2020-12-06 05:26

    You can create a temporary column to use in sort and then drop it:

    df.assign(f = df['one']**2 + df['two']**2).sort_values('f').drop('f', axis=1)
    Out: 
      letter  one  two
    2      b    1    3
    3      b    4    2
    1      a    3    4
    4      c    5    1
    0      a    2    5
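
For readers who want to run this end to end, here is a minimal self-contained sketch of the same temporary-column idea. It sorts in descending order, as the question asks, by passing ascending=False. The column names one/two follow this answer; `_key` is a hypothetical throwaway name chosen for the sketch.

```python
import pandas as pd

df = pd.DataFrame({'one': [2, 3, 1, 4, 5],
                   'two': [5, 4, 3, 2, 1],
                   'letter': ['a', 'a', 'b', 'b', 'c']})

# '_key' is a throwaway column name (an assumption of this sketch)
result = (df.assign(_key=df['one']**2 + df['two']**2)  # add the sort key
            .sort_values('_key', ascending=False)      # largest key first
            .drop(columns='_key'))                     # remove the helper column
print(result)
```

Because assign returns a new DataFrame, the original df is left without the helper column.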
    
  • 2020-12-06 05:27

    Another approach, similar to this one, is to use argsort, which returns the sorting permutation directly:

    f = lambda r: r.x**2 + r.y**2
    df.iloc[df.apply(f, axis=1).argsort()]
    

    I think argsort conveys the idea better than a regular sort: we don't care about the computed values, only about the resulting order of rows.
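
As a rough illustration of what argsort returns, the sketch below computes the key column-wise (without apply) and hands it to numpy.argsort. This is a variation on the answer's code rather than the same code; the x/y values are the ones visible in the output of this answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'x': [2, 3, 1, 4, 5],
                   'y': [5, 4, 3, 2, 1],
                   'letter': ['a', 'a', 'b', 'b', 'c']})

# Key values per row, as a plain NumPy array
key = (df['x']**2 + df['y']**2).to_numpy()

# argsort returns row positions in ascending key order;
# kind='stable' keeps tied rows in their original order
order = np.argsort(key, kind='stable')

# The positions feed straight into iloc
result = df.iloc[order]
print(order)
print(result)
```

The key values themselves are never stored on the frame; only the permutation is used.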

    It could also be interesting to patch the DataFrame to add this functionality:

    def apply_sort(self, *, key):
        return self.iloc[self.apply(key, axis=1).argsort()]
    
    pd.DataFrame.apply_sort = apply_sort
    

    We can then simply write:

    >>> df.apply_sort(key=f)
    
       x  y letter
    2  1  3      b
    3  4  2      b
    1  3  4      a
    4  5  1      c
    0  2  5      a
    
  • 2020-12-06 05:29

    Have you tried creating a new column and then sorting on it? I cannot comment on the original post, so I am posting my solution here.

    df['c'] = df.a**2 + df.b**2
    df = df.sort_values('c')
    
  • 2020-12-06 05:44
    df.loc[(df.x ** 2 + df.y ** 2).sort_values().index]
    

    Adapted from "How to sort pandas dataframe by custom order on string index".
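
One caveat worth spelling out: the sorted index holds labels, not positions, so label-based selection with .loc is the safer accessor whenever the index is not the default 0..n-1. The sketch below uses hypothetical labels 10..50 to make the difference visible.

```python
import pandas as pd

df = pd.DataFrame({'x': [2, 3, 1, 4, 5],
                   'y': [5, 4, 3, 2, 1]},
                  index=[10, 20, 30, 40, 50])  # hypothetical non-default labels

# Labels of the rows, ordered by x**2 + y**2
sorted_labels = (df.x**2 + df.y**2).sort_values().index

# .loc selects by label, so it works for any index
result = df.loc[sorted_labels]
print(result)

# df.iloc[sorted_labels] would fail here, because 10..50 are
# out of bounds when interpreted as row positions
```

With a default RangeIndex, labels and positions coincide, which is why the positional form appears to work on the example data.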

  • 2020-12-06 05:49
    from pandas import DataFrame
    import pandas as pd
    
    d = {'one':[2,3,1,4,5],
         'two':[5,4,3,2,1],
         'letter':['a','a','b','b','c']}
    
    df = pd.DataFrame(d)
    
    #f = lambda x,y: x**2 + y**2
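    # Compute the sum of squares row by row.
    # .ix was removed in pandas 1.0; .loc with column names replaces it.
    array = []
    for i in range(len(df)):
        array.append(df.loc[i, 'one']**2 + df.loc[i, 'two']**2)
    array = pd.DataFrame(array, columns = ['Sum of Squares'])
    test = pd.concat([df,array],axis = 1, join = 'inner')
    # sort_index(by=...) was removed as well; sort_values is the replacement.
    test = test.sort_values(by = "Sum of Squares", ascending = True).drop('Sum of Squares',axis =1)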
    

    Just realized that you wanted this:

        letter  one  two
    2      b    1    3
    3      b    4    2
    1      a    3    4
    4      c    5    1
    0      a    2    5
    