How to sort a pandas data frame using values from several columns?

走了就别回头了 2020-12-02 09:33

I have the following data frame:

df = pandas.DataFrame([{'c1':3,'c2':10},{'c1':2, 'c2':30},{'c1':1,'c2':20},{'c1':2,'c2':15},{'c1':2,'c2'


        
7 answers
  •  眼角桃花
    2020-12-02 10:29

    I have found this to be really useful:

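    import numpy as np
    import pandas as pd

    # list(...) is needed on Python 3, where a range object cannot be repeated with *
    df = pd.DataFrame({'A' : list(range(0,10)) * 2, 'B' : np.random.randint(20,30,20)})
    
    # DataFrame.sort is the older pandas API (removed in pandas 0.20)
    # A ascending, B descending
    df.sort(**skw(columns=['A','-B']))
    
    # A descending, B ascending
    df.sort(**skw(columns=['-A','+B']))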
    

    Note that unlike the standard columns= and ascending= arguments, which keep names and directions in two separate lists, this puts each column name and its sort order in the same place. That makes the code easier to read and maintain.
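
For comparison, here is roughly what the standard form looks like, with the column names and the sort directions kept in two parallel lists. This is a minimal sketch, assuming a current pandas where `DataFrame.sort_values` (with `by=`) has replaced the older `DataFrame.sort` (with `columns=`). The small frame is made up for illustration.

```python
import pandas as pd

# Hypothetical data, only to show the shape of the call
df = pd.DataFrame({'A': [1, 0, 1, 0], 'B': [20, 25, 29, 21]})

# A ascending, B descending: names and directions live in separate lists
out = df.sort_values(by=['A', 'B'], ascending=[True, False])
print(out)
```

The position of each entry in `ascending` has to match the position of its column in `by`, which is the bookkeeping the prefix notation avoids.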

    The actual call to .sort is unchanged: skw (sortkwargs) is just a small helper function that parses the column names and returns the usual columns= and ascending= parameters for you. Pass it any other sort kwargs as you usually would. Copy the following code into, e.g., your local utils.py and use it as above.

    # utils.py (or anywhere else convenient to import)
    def skw(columns=None, **kwargs):
        """ get sort kwargs by parsing sort order given in column name """
        # set default order as ascending (+) when no sign prefix is given
        sort_cols = [col if col[0] in '+-' else '+' + col for col in (columns or [])]
        # strip only the leading sign, so a '+' or '-' inside a name is preserved
        names = [col[1:] for col in sort_cols]
        # '+' means ascending, '-' means descending
        ascending = [col[0] == '+' for col in sort_cols]
        kwargs.update(dict(columns=names, ascending=ascending))
        return kwargs
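
`DataFrame.sort` was removed in pandas 0.20 in favor of `DataFrame.sort_values`, which takes `by=` instead of `columns=`. A rough adaptation of the same idea for current pandas might look like the sketch below. The name `skw_values` and the sample frame are hypothetical, not part of pandas or of the helper above.

```python
import pandas as pd


def skw_values(columns, **kwargs):
    """Build sort_values kwargs from names with an optional '+'/'-' prefix.

    Hypothetical helper: a leading '-' means descending; a leading '+'
    or no prefix means ascending. Only the first character is read as a sign.
    """
    by, ascending = [], []
    for col in columns:
        if col[:1] in ('+', '-'):
            by.append(col[1:])
            ascending.append(col[0] == '+')
        else:
            by.append(col)
            ascending.append(True)
    # Any other keyword arguments are passed through untouched
    kwargs.update(by=by, ascending=ascending)
    return kwargs


# Made-up data for illustration
df = pd.DataFrame({'A': [1, 0, 1, 0], 'B': [20, 25, 29, 21]})

# A ascending, B descending
result = df.sort_values(**skw_values(['A', '-B']))
print(result)
```

Because only the first character is treated as a sign, a column whose real name starts with `+` or `-` cannot be expressed with this notation and would need the plain `by=`/`ascending=` form.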
    
