I am trying to rank a pandas DataFrame based on two columns. I can rank it based on one column, but how can I rank it based on two columns? 'SaleCount', then 'TotalReve
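pd.factorize assigns an integer code to each distinct element of an iterable, numbering them in order of first appearance. So we only need to sort in the order we'd like, then factorize. To handle multiple columns, we convert each row of the sorted result to a tuple, so that rows with the same values in both columns share a code.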
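To see what factorize returns on its own, here is a minimal sketch on a small Series of tuples. The values are made up for illustration and are not the question's data.

```python
import pandas as pd

# Illustrative values only: three rows, two of them identical
pairs = pd.Series([(100, 9000), (100, 2000), (100, 9000)])

# codes: one integer per element, numbered by order of first appearance
# uniques: the distinct elements, in that same order
codes, uniques = pd.factorize(pairs)

print(codes)
print(list(uniques))
```

Because the codes follow order of first appearance, sorting before factorizing is what turns them into ranks.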
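import pandas as pd
cols = ['SaleCount', 'TotalRevenue']
# Sort descending by both columns, then turn each row into a tuple
tups = df[cols].sort_values(cols, ascending=False).apply(tuple, axis=1)
# Codes start at 0 and follow the sorted order of the tuples
f, i = pd.factorize(tups)
# Shift to 1-based ranks; keep the index so assign can align the rows
factorized = pd.Series(f + 1, index=tups.index)
df.assign(Rank=factorized)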
         Date  SaleCount  TotalRevenue shops  Rank
1  2016-12-02        100          9000    S2     1
5  2016-12-02        100          2000    S8     2
3  2016-12-02         35           750    S5     3
2  2016-12-02         30          1000    S1     4
7  2016-12-02         30           600    S7     5
4  2016-12-02         20           500    S4     6
9  2016-12-02         20           500   S10     6
0  2016-12-02         10           300    S3     7
8  2016-12-02          2            50    S9     8
6  2016-12-02          0             0    S6     9
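For anyone who wants to run this end to end, below is a self-contained sketch. The DataFrame is reconstructed from the output table above, with rows placed in index order 0 to 9 (an assumption about the original input order). The helper name dense_rank_desc is hypothetical, and the groupby/ngroup cross-check is an alternative I am adding for comparison, not part of the approach described above.

```python
import pandas as pd

# Sample data reconstructed from the output table (row order is an assumption)
df = pd.DataFrame({
    'Date': ['2016-12-02'] * 10,
    'SaleCount': [10, 100, 30, 35, 20, 100, 0, 30, 2, 20],
    'TotalRevenue': [300, 9000, 1000, 750, 500, 2000, 0, 600, 50, 500],
    'shops': ['S3', 'S2', 'S1', 'S5', 'S4', 'S8', 'S6', 'S7', 'S9', 'S10'],
})


def dense_rank_desc(frame, cols):
    """Hypothetical helper: 1-based dense rank, descending over cols."""
    # Sort first so that factorize numbers the tuples in rank order
    tups = frame[cols].sort_values(cols, ascending=False).apply(tuple, axis=1)
    codes, _ = pd.factorize(tups)
    # Reuse the sorted index so the ranks align back onto the frame
    return pd.Series(codes + 1, index=tups.index)


cols = ['SaleCount', 'TotalRevenue']
ranks = dense_rank_desc(df, cols)

# assign aligns on the index; sort by Rank only for display
print(df.assign(Rank=ranks).sort_values('Rank'))

# Cross-check (alternative, not from the answer): groups are sorted by key,
# and ascending=False numbers them from the largest key downwards
alt = df.groupby(cols).ngroup(ascending=False) + 1
print((alt == ranks.sort_index()).all())
```

Note that df.assign keeps the frame's own row order, so the sort_values('Rank') call is only there to display the rows from best to worst. Tied rows (here S4 and S10) share a rank, and the next rank follows without a gap.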