Pandas rank by multiple columns

小鲜肉 2020-12-16 19:05

I am trying to rank a pandas DataFrame based on two columns. I can rank it based on one column, but how can I rank it based on two columns? 'SaleCount', then 'TotalReve

5 answers
  •  萌比男神i
    2020-12-16 19:18

    sort_values + GroupBy.ngroup

    This gives a dense ranking: rows with identical values in both columns share a rank, and no rank numbers are skipped.

    The DataFrame should be sorted by these columns, in the desired order, prior to the groupby. Specifying sort=False within the groupby then respects this sorting, so that groups are labeled in the order they appear within the sorted DataFrame.
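To make the role of sort=False concrete, here is a small sketch on a toy frame. The column name `a` and its values are hypothetical, chosen only for illustration and not taken from the question.

```python
import pandas as pd

# Hypothetical toy data, only to show how group labels are assigned.
toy = pd.DataFrame({"a": [1, 3, 2, 3]})
ordered = toy.sort_values("a", ascending=False)

# sort=False: groups are numbered in order of first appearance (3, 2, 1).
by_appearance = ordered.groupby("a", sort=False).ngroup()

# Default sort=True: groups are numbered by ascending key (1, 2, 3),
# which would undo the descending order established by sort_values.
by_key = ordered.groupby("a").ngroup()

print(by_appearance.sort_index().tolist())  # [2, 0, 1, 0]
print(by_key.sort_index().tolist())         # [0, 2, 1, 2]
```

With sort=False the largest value gets label 0, which becomes rank 1 once 1 is added.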

    cols = ['SaleCount', 'TotalRevenue']
    df['Rank'] = df.sort_values(cols, ascending=False).groupby(cols, sort=False).ngroup() + 1
    

    Output:

    print(df.sort_values('Rank'))
    
       TotalRevenue        Date  SaleCount shops  Rank
    1          9000  2016-12-02        100    S2     1
    5          2000  2016-12-02        100    S8     2
    3           750  2016-12-02         35    S5     3
    2          1000  2016-12-02         30    S1     4
    7           600  2016-12-02         30    S7     5
    4           500  2016-12-02         20    S4     6
    9           500  2016-12-02         20   S10     6
    0           300  2016-12-02         10    S3     7
    8            50  2016-12-02          2    S9     8
    6             0  2016-12-02          0    S6     9
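For anyone who wants to reproduce this end to end, the following is a self-contained sketch. The DataFrame is rebuilt by hand from the rows of the output table above, so the construction step is an assumption about how the original data was laid out; the ranking line itself is the one from this answer.

```python
import pandas as pd

# Data reconstructed from the output table above (index 0..9).
df = pd.DataFrame(
    {
        "TotalRevenue": [300, 9000, 1000, 750, 500, 2000, 0, 600, 50, 500],
        "Date": ["2016-12-02"] * 10,
        "SaleCount": [10, 100, 30, 35, 20, 100, 0, 30, 2, 20],
        "shops": ["S3", "S2", "S1", "S5", "S4", "S8", "S6", "S7", "S9", "S10"],
    }
)

cols = ["SaleCount", "TotalRevenue"]

# Sort descending by both columns, label groups in order of appearance,
# then shift labels to start at 1. Assignment aligns on the index.
df["Rank"] = (
    df.sort_values(cols, ascending=False)
      .groupby(cols, sort=False)
      .ngroup()
    + 1
)

print(df.sort_values("Rank"))
```

Shops S4 and S10 have the same SaleCount and TotalRevenue, so both receive rank 6 and the next shop receives rank 7.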
    
