Keep maximum value per group including repetitions

Posted by 和自甴很熟 on 2021-02-10 05:30:32

Question


Let's say I have a dataframe like this:

    a   b   c
0   x1  y1  9
1   x1  y2  9
2   x1  y3  4
3   x2  y4  2
4   x2  y5  10
5   x2  y6  5
6   x3  y7  6
7   x3  y8  4
8   x3  y9  8
9   x4  y10 11
10  x4  y11 11
11  x4  y12 11

I want to sort column c within each group (grouped by column a) and then keep every row in each group that holds the group's highest value of c, including ties. The output should look like:

    a   b   c
0   x1  y1  9
1   x1  y2  9
4   x2  y5  10
8   x3  y9  8
9   x4  y10 11
10  x4  y11 11
11  x4  y12 11

Is there a clean way of doing so without using any loops, etc.?
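
To make the example reproducible, here is a minimal sketch that rebuilds the frame shown above. The construction itself is an assumption, since the question only shows the printed frame; `group_max` is an illustrative name for the per-group maximum that the kept rows must equal.

```python
import pandas as pd

# Rebuild the example frame from the question (values copied from the printed table).
df = pd.DataFrame({
    "a": ["x1"] * 3 + ["x2"] * 3 + ["x3"] * 3 + ["x4"] * 3,
    "b": [f"y{i}" for i in range(1, 13)],
    "c": [9, 9, 4, 2, 10, 5, 6, 4, 8, 11, 11, 11],
})

# Per-group maximum of c; a row is kept when its c equals its own group's maximum.
group_max = df.groupby("a")["c"].max()
print(group_max)
```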


Answer 1:


You could group by column a, find the max per group, and merge the resulting dataframe back to keep the matching rows. Passing as_index=False keeps a as a column, so the merge matches on both a and c:

df.merge(df.groupby('a', as_index=False)['c'].max())

    a    b   c
0  x1   y1   9
1  x1   y2   9
2  x2   y5  10
3  x3   y9   8
4  x4  y10  11
5  x4  y11  11
6  x4  y12  11
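
The result of the merge depends on which columns act as join keys. The following is a minimal sketch on a hypothetical frame (the groups `g1`/`g2` and the names `kept` and `too_many` are illustrative, not from the original answer) in which one group contains a non-maximal value that happens to equal another group's maximum.

```python
import pandas as pd

# Hypothetical frame: g1 contains a non-maximal 3, which is also the maximum of g2.
df = pd.DataFrame({
    "a": ["g1", "g1", "g2", "g2"],
    "b": ["p", "q", "r", "s"],
    "c": [5, 3, 3, 1],
})

# One row per group, holding the group key and its maximum.
group_max = df.groupby("a", as_index=False)["c"].max()

# Joining on both columns keeps only rows that match their own group's maximum.
kept = df.merge(group_max, on=["a", "c"])

# Joining on c alone also keeps (g1, q, 3), because 3 is the maximum of g2.
too_many = df.merge(group_max[["c"]], on="c")

print(kept)
print(too_many)
```

Naming the keys explicitly with `on=["a", "c"]` makes the intent visible and avoids relying on which columns the two frames happen to share.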



Answer 2:


You can do it with groupby.transform and max, comparing each row's c with its group's maximum:

df.loc[df['c'].eq(df.groupby('a')['c'].transform('max')), :]

     a    b   c
0   x1   y1   9
1   x1   y2   9
4   x2   y5  10
8   x3   y9   8
9   x4  y10  11
10  x4  y11  11
11  x4  y12  11
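
The transform approach can be broken into steps to show what the mask contains. This is a minimal sketch that assumes the frame from the question, rebuilt here so the block runs on its own; `group_max`, `mask` and `result` are illustrative names.

```python
import pandas as pd

# Frame from the question (values copied from the printed table).
df = pd.DataFrame({
    "a": ["x1"] * 3 + ["x2"] * 3 + ["x3"] * 3 + ["x4"] * 3,
    "b": [f"y{i}" for i in range(1, 13)],
    "c": [9, 9, 4, 2, 10, 5, 6, 4, 8, 11, 11, 11],
})

# transform('max') broadcasts each group's maximum back onto every row of that group,
# so the result has the same index and length as df.
group_max = df.groupby("a")["c"].transform("max")

# Boolean mask: True where the row's c equals its own group's maximum (ties included).
mask = df["c"].eq(group_max)

# Boolean indexing keeps the original index labels, unlike the merge approach.
result = df[mask]
print(result)
```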



Answer 3:


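You can use groupby together with sort_values. Note that the snippet below sums c per group and sorts the groups by that total, so it yields one row per group rather than the rows holding each group's maximum: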

df = df.groupby(['a'])['c'].sum().reset_index()
df = df.sort_values(by=['c'], ascending=False)
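
A sort-based route to the rows the question asks for can be sketched as follows. This is an assumption about how groupby and sort_values might be combined for that purpose, not the method from the answer above; `sorted_df`, `top` and `result` are illustrative names.

```python
import pandas as pd

# Frame from the question (values copied from the printed table).
df = pd.DataFrame({
    "a": ["x1"] * 3 + ["x2"] * 3 + ["x3"] * 3 + ["x4"] * 3,
    "b": [f"y{i}" for i in range(1, 13)],
    "c": [9, 9, 4, 2, 10, 5, 6, 4, 8, 11, 11, 11],
})

# Sort by group, then by c descending, so each group's maximum comes first.
sorted_df = df.sort_values(by=["a", "c"], ascending=[True, False])

# After sorting, the first c in each group is that group's maximum.
top = sorted_df.groupby("a")["c"].transform("first")

# Keep every row whose c equals the leading value of its group (ties included).
result = sorted_df[sorted_df["c"].eq(top)]
print(result)
```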


Source: https://stackoverflow.com/questions/61801384/keep-maximum-value-per-group-including-repetitions
