What is the best way to apply a function over the index of a Pandas DataFrame?
Currently I am using this verbose approach:
pd.DataFrame({"Month":
A lot of answers return the index as an array, which loses information such as the index name (though you could do pd.Series(index.map(myfunc), name=index.name)). That approach also won't work for a MultiIndex.
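For a single-level index, the name-preserving idea in the parenthetical above can be sketched as follows. This is only a minimal illustration: the DataFrame contents and the body of `myfunc` are made up for the example, and the result is rebuilt as a `pd.Index` rather than a `pd.Series` so it can be assigned straight back to the frame.

```python
import pandas as pd

# Small example frame with a named, single-level index (illustrative data)
df = pd.DataFrame({"value": [10, 20, 30]},
                  index=pd.Index([1, 2, 3], name="num"))

def myfunc(label):
    # Hypothetical transformation, chosen only for illustration
    return label * 100

# Map the function over the index labels and carry the name over explicitly
df.index = pd.Index(df.index.map(myfunc), name=df.index.name)
print(df)
```

Passing the name explicitly avoids relying on whether a given pandas version keeps it through `map`.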
The way I handled this is to use "rename":
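import numpy as np
import pandas as pd

# Build a Series with a two-level MultiIndex ('num', 'name')
mix = pd.MultiIndex.from_tuples([(1, 'hi'), (2, 'there'), (3, 'dude')], names=['num', 'name'])
data = np.random.randn(3)
df = pd.Series(data, index=mix)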
print(df)
num  name
1    hi       1.249914
2    there   -0.414358
3    dude     0.987852
dtype: float64
# Build one dictionary that maps old labels to new labels for both levels
rename_dict = {i: i*100 for i in df.index.get_level_values('num')}
rename_dict.update({i: i+'_yeah!' for i in df.index.get_level_values('name')})
df = df.rename(index=rename_dict)
print(df)
num  name
100  hi_yeah!       1.249914
200  there_yeah!   -0.414358
300  dude_yeah!     0.987852
dtype: float64
The only catch is that your labels need to be unique across the different MultiIndex levels, because the dictionary is applied to every level. Maybe someone more clever than me knows how to get around that. For my purposes this works 95% of the time.
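One possible way around the uniqueness restriction, assuming a pandas version whose `rename` accepts the `level` argument, is to rename one level at a time so each mapping only touches the level it is meant for. The sketch below uses made-up labels that deliberately repeat across levels; the level names `outer` and `inner` are illustrative, not from the example above.

```python
import pandas as pd

# The label 'a' appears in both levels, so one shared dict would be ambiguous
mix = pd.MultiIndex.from_tuples(
    [('a', 'a'), ('b', 'a'), ('c', 'b')], names=['outer', 'inner']
)
s = pd.Series([1.0, 2.0, 3.0], index=mix)

# Apply a separate function to each level; `level` restricts where it is used
renamed = (
    s.rename(index=lambda label: label.upper(), level='outer')
     .rename(index=lambda label: label + '_yeah!', level='inner')
)
print(renamed)
```

Because each call is limited to a single level, the functions (or dictionaries) for different levels can no longer interfere with each other.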