I would like to create a rank on year (so in year 2012, Manager B is 1. In 2011, Manager B is 1 again). I struggled with the pandas rank function for awhile and DO NOT want
It sounds like you want to group by the Year, then rank the Returns in descending order.
import pandas as pd
s = pd.DataFrame([['2012', 'A', 3], ['2012', 'B', 8], ['2011', 'A', 20], ['2011', 'B', 30]],
columns=['Year', 'Manager', 'Return'])
s['Rank'] = s.groupby(['Year'])['Return'].rank(ascending=False)
print(s)
yields
Year Manager Return Rank
0 2012 A 3 2
1 2012 B 8 1
2 2011 A 20 2
3 2011 B 30 1
To address the OP's revised question: The error message
ValueError: cannot reindex from a duplicate axis
occurs when trying to groupby/rank on a DataFrame with duplicate values in the index. You can avoid the problem by constructing s to have unique index values after appending:
s = pd.DataFrame([['2012', 'A', 3], ['2012', 'B', 8], ['2011', 'A', 20], ['2011', 'B', 30]], columns=['Year', 'Manager', 'Return'])
b = pd.DataFrame([['2012', 'A', 3], ['2012', 'B', 8], ['2011', 'A', 20], ['2011', 'B', 30]], columns=['Year', 'Manager', 'Return'])
s = s.append(b, ignore_index=True)
yields
Year Manager Return
0 2012 A 3
1 2012 B 8
2 2011 A 20
3 2011 B 30
4 2012 A 3
5 2012 B 8
6 2011 A 20
7 2011 B 30
If you've already appended new rows using
s = s.append(b)
then use reset_index to create a unique index:
s = s.reset_index(drop=True)