Custom sorting in pandas dataframe

情歌与酒 2020-11-22 16:34

I have a pandas DataFrame in which one column contains month names.

How can I do a custom sort using a dictionary, for example:

custom_dict = {\'         


        
4 Answers
  •  忘掉有多难
    2020-11-22 16:54

    Pandas 0.15 introduced Categorical Series, which allows a much clearer way to do this:

    First make the month column a categorical and specify the ordering to use.

    In [21]: df['m'] = pd.Categorical(df['m'], ["March", "April", "Dec"])
    
    In [22]: df  # looks the same!
    Out[22]:
       a  b      m
    0  1  2  March
    1  5  6    Dec
    2  3  4  April
    

    Now, when you sort the month column it will sort with respect to that list:

    In [23]: df.sort_values("m")
    Out[23]:
       a  b      m
    0  1  2  March
    2  3  4  April
    1  5  6    Dec
    

    Note: if a value is not in the list it will be converted to NaN.
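
As a rough, self-contained sketch of the categorical approach and the NaN caveat above: the extra `"Jan"` row and the names `month_order`, `unmatched`, and `sorted_df` are illustrative assumptions, not part of the original answer. Passing `ordered=True` is optional for sorting, but it also makes comparisons such as `<` follow the custom order.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 2, "March"], [5, 6, "Dec"], [3, 4, "April"], [7, 8, "Jan"]],
    columns=["a", "b", "m"],
)

# "Jan" is deliberately left out of the category list (illustrative choice).
month_order = ["March", "April", "Dec"]
df["m"] = pd.Categorical(df["m"], categories=month_order, ordered=True)

# Values that are not in the category list have become missing values.
unmatched = df[df["m"].isna()]

# Sorting follows month_order; missing values are placed last by default.
sorted_df = df.sort_values("m")
print(sorted_df)
```

Checking `unmatched` before sorting is a cheap way to catch misspelled or unexpected month names, since they would otherwise silently turn into NaN.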


    An older answer for those interested...

    You could create an intermediary series, and set_index on that:

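    import pandas as pd

    df = pd.DataFrame([[1, 2, 'March'],[5, 6, 'Dec'],[3, 4, 'April']], columns=['a','b','m'])
    # Map each month name to its rank in the custom order
    s = df['m'].apply(lambda x: {'March':0, 'April':1, 'Dec':3}[x])
    s = s.sort_values()  # returns a new Series, so the result must be assigned
    
    In [4]: df.set_index(s.index).sort_index()  # DataFrame.sort() no longer exists in current pandas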
    Out[4]: 
       a  b      m
    0  1  2  March
    1  3  4  April
    2  5  6    Dec
    

    As commented, in newer pandas, Series has a replace method to do this more elegantly:

    s = df['m'].replace({'March':0, 'April':1, 'Dec':3})
    

    The slight difference is that this won't raise if there is a value outside of the dictionary (it'll just stay the same).
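
To make that difference concrete, and to show one way of sorting directly by a dictionary, here is a minimal sketch. It assumes pandas 1.1 or later for the `key` argument of `sort_values`; the names `month_rank` and `probe` are hypothetical, and the dictionary values are the ones used in this answer, not the truncated one from the question.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 2, "March"], [5, 6, "Dec"], [3, 4, "April"]],
    columns=["a", "b", "m"],
)
month_rank = {"March": 0, "April": 1, "Dec": 3}  # hypothetical name

# replace keeps values missing from the dictionary; map turns them into NaN.
probe = pd.Series(["March", "Jan"])
replaced = probe.replace(month_rank)
mapped = probe.map(month_rank)

# Sort by the mapped ranks without storing them in a helper column.
sorted_df = df.sort_values("m", key=lambda col: col.map(month_rank))
print(sorted_df)
```

With `key`, the original index labels and the `m` column are left untouched; only the row order changes.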
