Pandas Transform Position/Rank in Group

拜拜、爱过 submitted on 2019-12-11 04:27:23

Question


I have the following DataFrame, which records two groups of animals and how much food each animal eats per day:

import pandas as pd

df = pd.DataFrame({'animals': ['cat', 'cat', 'dog', 'dog', 'rat',
                               'cat', 'rat', 'rat', 'dog', 'cat'],
                   'food': [1, 2, 2, 5, 3, 1, 4, 0, 6, 5]},
                  index=pd.MultiIndex.from_product([['group1', 'group2'],
                                                    list(range(5))])
                  ).rename_axis(['groups', 'day'])

df

            animals food
groups  day     
group1  0   cat     1
        1   cat     2
        2   dog     2
        3   dog     5
        4   rat     3
group2  0   cat     1
        1   rat     4
        2   rat     0
        3   dog     6
        4   cat     5

I can "map"/transform this into a new column, daily_meal, to see how much food each individual animal should be given per day.

df['daily_meal'] = df.groupby(['animals', 'groups'])['food'].transform('mean')
df

            animals food    daily_meal
groups  day         
group1  0   cat     1       1.5
        1   cat     2       1.5
        2   dog     2       3.5
        3   dog     5       3.5
        4   rat     3       3.0
group2  0   cat     1       3.0
        1   rat     4       2.0
        2   rat     0       2.0
        3   dog     6       6.0
        4   cat     5       3.0
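
For readers less familiar with transform, here is a minimal sketch of how it differs from agg on this data. The variable names grouped, per_pair and per_row are only illustrative and are not part of the original question.

```python
import pandas as pd

# Same data as in the question above.
df = pd.DataFrame(
    {'animals': ['cat', 'cat', 'dog', 'dog', 'rat',
                 'cat', 'rat', 'rat', 'dog', 'cat'],
     'food': [1, 2, 2, 5, 3, 1, 4, 0, 6, 5]},
    index=pd.MultiIndex.from_product([['group1', 'group2'], range(5)],
                                     names=['groups', 'day']))

# Group by a column ('animals') and an index level ('groups') together.
grouped = df.groupby(['animals', 'groups'])['food']

# agg collapses each (animals, groups) pair into a single row.
per_pair = grouped.agg('mean')

# transform broadcasts the same means back onto the original rows,
# so the result can be assigned directly as a new column.
per_row = grouped.transform('mean')

print(len(per_pair), len(per_row))
```

Because per_row keeps the index of df, it lines up row for row with the original frame, which is what makes the assignment to daily_meal work.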

I now wish to know where that daily_meal ranks within each group, and "map"/transform this into a new column called group_rank. How can I do this?

e.g.

            animals food    daily_meal   group_rank
groups  day         
group1  0   cat     1       1.5          1
        1   cat     2       1.5          1
        2   dog     2       3.5          3
        3   dog     5       3.5          3
        4   rat     3       3.0          2

group2  0   cat     1       3.0          2
        1   rat     4       2.0          1
        2   rat     0       2.0          1
        3   dog     6       6.0          3
        4   cat     5       3.0          2

Answer 1:


Use transform to get the per-animal mean, then a grouped rank with method='dense':

df['daily_meal'] = df.groupby(['animals', 'groups'])['food'].transform('mean')
df['group_rank'] = df.groupby('groups')['daily_meal'].rank(method='dense')
print(df)
           animals  food  daily_meal  group_rank
groups day                                      
group1 0       cat     1         1.5         1.0
       1       cat     2         1.5         1.0
       2       dog     2         3.5         3.0
       3       dog     5         3.5         3.0
       4       rat     3         3.0         2.0
group2 0       cat     1         3.0         2.0
       1       rat     4         2.0         1.0
       2       rat     0         2.0         1.0
       3       dog     6         6.0         3.0
       4       cat     5         3.0         2.0

Or:

s = df.groupby(['animals', 'groups'])['food'].transform('mean')
df['group_rank'] = s.groupby('groups').transform(lambda x: x.rank(method='dense'))
print(df)
           animals  food  group_rank
groups day                          
group1 0       cat     1         1.0
       1       cat     2         1.0
       2       dog     2         3.0
       3       dog     5         3.0
       4       rat     3         2.0
group2 0       cat     1         2.0
       1       rat     4         1.0
       2       rat     0         1.0
       3       dog     6         3.0
       4       cat     5         2.0

Thanks to Scott Boston for improving the solution; rank can be called directly on the groupby, without a lambda:

df['daily_meal'] = df.groupby(['animals', 'groups'])['food'].transform('mean')
df['group_rank'] = df.groupby('groups')['daily_meal'].rank(method='dense')

s = df.groupby(['animals', 'groups'])['food'].transform('mean')
df['group_rank'] = s.groupby('groups').rank(method='dense')
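
To illustrate why method='dense' is the choice that matches the desired output, here is a small sketch comparing three rank methods on the group1 daily_meal values from the question. The variable names are only illustrative.

```python
import pandas as pd

# daily_meal values for group1 (cat, cat, dog, dog, rat).
s = pd.Series([1.5, 1.5, 3.5, 3.5, 3.0])

average = s.rank()               # default method='average': ties share the mean rank
minimum = s.rank(method='min')   # ties share the lowest rank, later ranks skip ahead
dense = s.rank(method='dense')   # ties share a rank, and no ranks are skipped

print(pd.DataFrame({'value': s, 'average': average,
                    'min': minimum, 'dense': dense}))
```

Only the dense ranks run 1, 2, 3 without gaps, which is what the group_rank column in the question expects.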



Answer 2:


Using get_level_values + transform + rank

df.groupby([df.index.get_level_values(level='groups')])['daily_meal'].transform(lambda x: x.rank(method='dense'))
Out[1068]: 
groups  day
group1  0      1.0
        1      1.0
        2      3.0
        3      3.0
        4      2.0
group2  0      2.0
        1      1.0
        2      1.0
        3      3.0
        4      2.0
Name: daily_meal, dtype: float64

After assigning the result:

df['group_rank'] = df.groupby([df.index.get_level_values(level='groups')])['daily_meal'].transform(lambda x: x.rank(method='dense'))
df
Out[1070]: 
           animals  food  daily_meal  group_rank
groups day                                      
group1 0       cat     1         1.5         1.0
       1       cat     2         1.5         1.0
       2       dog     2         3.5         3.0
       3       dog     5         3.5         3.0
       4       rat     3         3.0         2.0
group2 0       cat     1         3.0         2.0
       1       rat     4         2.0         1.0
       2       rat     0         2.0         1.0
       3       dog     6         6.0         3.0
       4       cat     5         3.0         2.0

Here is how I computed daily_meal:

df['daily_meal'] = df.groupby([df.index.get_level_values(level='groups'), df.animals])['food'].transform('mean')
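
As a possible shorthand for the get_level_values approach, the sketch below groups on the index level by name with groupby(level='groups') and checks that both spellings give the same ranks. It also casts to int, on the assumption that integer ranks like those in the question's expected output are wanted; the names by_values and by_level are only illustrative.

```python
import pandas as pd

# Same data as in the question.
df = pd.DataFrame(
    {'animals': ['cat', 'cat', 'dog', 'dog', 'rat',
                 'cat', 'rat', 'rat', 'dog', 'cat'],
     'food': [1, 2, 2, 5, 3, 1, 4, 0, 6, 5]},
    index=pd.MultiIndex.from_product([['group1', 'group2'], range(5)],
                                     names=['groups', 'day']))

df['daily_meal'] = df.groupby(['animals', 'groups'])['food'].transform('mean')

# Grouping on the extracted level values, as in the answer above.
by_values = (df.groupby(df.index.get_level_values('groups'))['daily_meal']
               .rank(method='dense'))

# Grouping on the index level by name.
by_level = df.groupby(level='groups')['daily_meal'].rank(method='dense')

# rank returns floats; the cast is safe here because there are no NaN values.
df['group_rank'] = by_level.astype(int)
print(df)
```

The cast would fail if daily_meal contained missing values, since rank leaves those as NaN by default.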


Source: https://stackoverflow.com/questions/47775927/pandas-transform-position-rank-in-group
