Question
A small DataFrame with a two-level MultiIndex and one column. The second index level (level 1) sorts in alphabetical order, putting 'Four' before 'Three'.
import pandas as pd
df = pd.DataFrame({'A': [1, 1, 2, 2],
                   'B': ['One', 'Two', 'Three', 'Four'],
                   'X': [1, 2, 3, 4]},
                  index=range(4)).set_index(['A', 'B']).sort_index()
df
         X
A B
1 One    1
  Two    2
2 Four   4
  Three  3
Clearly the second level of the index (B) is in alphabetical order, so the idea is to replace it with a categorical index to force the desired ordering.
# set_levels no longer accepts inplace=True (removed in pandas 2.0), so assign the result back.
df.index = df.index.set_levels(
    pd.CategoricalIndex(df.index.levels[1],
                        categories=['One', 'Two', 'Three', 'Four'], ordered=True),
    level=1)
With this done, inspecting the index shows that level 1 is indeed a categorical index. But sorting the index still does not put the rows in the desired order.
df.sort_index()
         X
A B
1 One    1
  Two    2
2 Four   4
  Three  3
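One possible explanation, offered as a sketch and not as a confirmed account of pandas internals: a MultiIndex stores each level's distinct values together with integer codes that point into them, and sorting appears to follow those codes. Swapping the level values with set_levels leaves the codes untouched, so they still reflect the alphabetical ordering. The snippet below rebuilds the frame from the question and prints both pieces; the names level_values and level_codes are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 1, 2, 2],
                   'B': ['One', 'Two', 'Three', 'Four'],
                   'X': [1, 2, 3, 4]}).set_index(['A', 'B']).sort_index()

# Distinct values of level 1, as stored by the MultiIndex.
level_values = list(df.index.levels[1])
# One integer per row, giving the position of that row's label in level_values.
level_codes = [int(code) for code in df.index.codes[1]]

print(level_values)
print(level_codes)
```

The codes increase within each group of A, which matches the alphabetical row order shown above.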
Note: If the DataFrame has a simple one-level index, this works as expected.
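For comparison, here is a minimal sketch of the one-level case mentioned in the note. The frame and the names order, single and sorted_single are made up for illustration; the rows start in reverse order so that the sort has something to do.

```python
import pandas as pd

order = ['One', 'Two', 'Three', 'Four']

# A single-level ordered categorical index, with rows deliberately reversed.
single = pd.DataFrame(
    {'X': [4, 3, 2, 1]},
    index=pd.CategoricalIndex(['Four', 'Three', 'Two', 'One'],
                              categories=order, ordered=True))

# Sorting follows the category order, not alphabetical order.
sorted_single = single.sort_index()
print(sorted_single)
```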
Answer 1:
I managed to get this working by setting the index after the DataFrame has been created. I am not sure this is the best answer, but it is an answer:
df = pd.DataFrame({'A': [1, 1, 2, 2],
                   'B': ['One', 'Two', 'Three', 'Four'],
                   'X': [1, 2, 3, 4]})
df = df.set_index(['A', pd.CategoricalIndex(df['B'], categories=['One','Two','Three', 'Four'], ordered=True)])
del df['B']
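To check that the resulting index really sorts by category, the same approach can be sketched on rows that start out of order. This is a variation on the code above and not a different method: the shuffled data and the names order, b_level and result are assumptions added for illustration, and drop(columns='B') stands in for del df['B'].

```python
import pandas as pd

order = ['One', 'Two', 'Three', 'Four']

# Rows are deliberately out of order so that sorting has work to do.
df = pd.DataFrame({'A': [2, 2, 1, 1],
                   'B': ['Four', 'Three', 'Two', 'One'],
                   'X': [4, 3, 2, 1]})

# Build the ordered categorical level first, then hand it to set_index.
b_level = pd.CategoricalIndex(df['B'], categories=order, ordered=True)

# Column B is not consumed when an index object is passed, so drop it explicitly.
result = df.set_index(['A', b_level]).drop(columns='B').sort_index()

print(result)
```

Because the categorical level exists when the MultiIndex is built, sort_index places 'Three' before 'Four' within A == 2.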
Source: https://stackoverflow.com/questions/49318345/index-sort-order-of-a-multi-index-dataframe-does-not-respect-categorical-index-o