Pandas rank by multiple columns

小鲜肉 2020-12-16 19:05

I am trying to rank a pandas DataFrame based on two columns. I can rank it based on one column, but how can I rank it based on two columns? 'SaleCount', then 'TotalReve

5 Answers

萌比男神i (OP)

2020-12-16 19:18

`sort_values` + `GroupBy.ngroup`

This will give the dense ranking.

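The DataFrame should be sorted by the ranking columns, in the desired order, prior to the groupby. Specifying `sort=False` within the groupby then respects this sorting, so that groups are labeled in the order they appear within the sorted DataFrame. Rows that tie on both columns fall into the same group and therefore share a rank.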

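# Rank by SaleCount first, then by TotalRevenue (highest values get rank 1)
cols = ['SaleCount', 'TotalRevenue']
# ngroup() numbers groups from 0, so add 1; the result aligns back to df by index
df['Rank'] = df.sort_values(cols, ascending=False).groupby(cols, sort=False).ngroup() + 1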

Output:

print(df.sort_values('Rank'))

   TotalRevenue        Date  SaleCount shops  Rank
1          9000  2016-12-02        100    S2     1
5          2000  2016-12-02        100    S8     2
3           750  2016-12-02         35    S5     3
2          1000  2016-12-02         30    S1     4
7           600  2016-12-02         30    S7     5
4           500  2016-12-02         20    S4     6
9           500  2016-12-02         20   S10     6
0           300  2016-12-02         10    S3     7
8            50  2016-12-02          2    S9     8
6             0  2016-12-02          0    S6     9
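To try the approach end to end, here is a minimal self-contained sketch. The DataFrame is reconstructed from the output table above (the original input was not posted in this answer, so the construction step is an assumption); the ranking line itself is the one from the answer.

```python
import pandas as pd

# Sample data rebuilt from the printed output above (assumed input).
df = pd.DataFrame({
    "TotalRevenue": [300, 9000, 1000, 750, 500, 2000, 0, 600, 50, 500],
    "Date": ["2016-12-02"] * 10,
    "SaleCount": [10, 100, 30, 35, 20, 100, 0, 30, 2, 20],
    "shops": ["S3", "S2", "S1", "S5", "S4", "S8", "S6", "S7", "S9", "S10"],
})

cols = ["SaleCount", "TotalRevenue"]

# Sort descending by both columns, then label each distinct
# (SaleCount, TotalRevenue) pair in order of appearance.
# sort=False keeps the groupby from re-sorting the groups.
df["Rank"] = (
    df.sort_values(cols, ascending=False)
      .groupby(cols, sort=False)
      .ngroup()
    + 1
)

print(df.sort_values("Rank"))
```

S4 and S10 tie on both columns, so they share a rank and the next rank follows without a gap, which is what makes this a dense ranking.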
