Python Multiindex Dataframe remove maximum

Submitted by 半世苍凉 on 2021-02-07 04:05:59

Question


I am struggling with a MultiIndex DataFrame in pandas.

Suppose I have a df like this:

                    count    day     
group    name

  A      Anna        10      Monday
         Beatrice    15      Tuesday

  B      Beatrice    15      Wednesday
         Cecilia     20      Thursday
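
For anyone who wants to reproduce this, here is a minimal sketch of how a frame like the one above could be built. The construction itself is an assumption, since the post only shows the printed output; the level names `group` and `name` are read off that output.

```python
import pandas as pd

# Two-level row index: "group" is the outer level, "name" the inner one.
index = pd.MultiIndex.from_tuples(
    [("A", "Anna"), ("A", "Beatrice"), ("B", "Beatrice"), ("B", "Cecilia")],
    names=["group", "name"],
)

# Column values copied from the table shown above.
df = pd.DataFrame(
    {
        "count": [10, 15, 15, 20],
        "day": ["Monday", "Tuesday", "Wednesday", "Thursday"],
    },
    index=index,
)
print(df)
```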

What I need is to find the maximum in name for each group (that is, the alphabetically last name) and remove that row from the dataframe.

The final df would look like:

                    count    day     
group    name

  A      Anna        10      Monday

  B      Beatrice    15      Wednesday

Does anyone have an idea how to do this? I am running out of ideas...

Thanks in advance!

EDIT:

What if the original dataframe is:

                   count    day     
group    name

  A      Anna        10      Monday
         Beatrice    15      Tuesday

  B      Beatrice    20      Wednesday
         Cecilia     15      Thursday

and the final df needs to be:

                    count    day     
group    name

  A      Anna        10      Monday

  B      Beatrice    20      Wednesday

Answer 1:


UPDATE (for the edited example, dropping the alphabetically last name in each group):

In [386]: idx = (df.reset_index('name')
                   .groupby('group')['name']
                   .max()
                   .reset_index()
                   .values.tolist())

In [387]: df.loc[df.index.difference(idx)]
Out[387]:
                count        day
group name
A     Anna         10     Monday
B     Beatrice     20  Wednesday
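
The same idea can be written as a self-contained script. This is only a sketch under assumptions: the frame is rebuilt by hand from the edited question, and `DataFrame.drop` with a list of `(group, name)` tuples is swapped in for the index difference used in the session above. The names `max_names` and `to_drop` are made up for illustration.

```python
import pandas as pd

# Frame from the edited question (rebuilt by hand, an assumption).
index = pd.MultiIndex.from_tuples(
    [("A", "Anna"), ("A", "Beatrice"), ("B", "Beatrice"), ("B", "Cecilia")],
    names=["group", "name"],
)
df = pd.DataFrame(
    {
        "count": [10, 15, 20, 15],
        "day": ["Monday", "Tuesday", "Wednesday", "Thursday"],
    },
    index=index,
)

# Alphabetically last name per group, as a Series indexed by group.
max_names = df.reset_index("name").groupby("group")["name"].max()

# (group, name) tuples are the row labels of a two-level MultiIndex.
to_drop = list(max_names.items())

# Drop exactly those rows; all other rows keep their original order.
result = df.drop(index=to_drop)
print(result)
```

Unlike `Index.difference`, which returns a sorted index, `drop` leaves the remaining rows in their original order; for this small example the two give the same frame.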

For the original example (dropping the row with the largest count in each group):

In [326]: df.loc[df.index.difference(df.groupby('group')['count'].idxmax())]
Out[326]:
                count        day
group name
A     Anna         10     Monday
B     Beatrice     15  Wednesday

PS: there is most probably a better way to do this...
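
One possible alternative for the count-based case is a boolean mask built with `groupby(...).transform("max")`. This is a sketch rather than the answerer's method, and it assumes the frame from the original (unedited) question, rebuilt by hand. It behaves differently from `idxmax` when a group has ties: `idxmax` points at the first maximum only, while the mask removes every row that equals the group maximum.

```python
import pandas as pd

# Frame from the original question (rebuilt by hand, an assumption).
index = pd.MultiIndex.from_tuples(
    [("A", "Anna"), ("A", "Beatrice"), ("B", "Beatrice"), ("B", "Cecilia")],
    names=["group", "name"],
)
df = pd.DataFrame(
    {
        "count": [10, 15, 15, 20],
        "day": ["Monday", "Tuesday", "Wednesday", "Thursday"],
    },
    index=index,
)

# Broadcast each group's maximum count back onto every row of that group.
group_max = df.groupby(level="group")["count"].transform("max")

# Keep only the rows whose count is strictly below their group's maximum.
result = df[df["count"] < group_max]
print(result)
```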



Source: https://stackoverflow.com/questions/49669129/python-multiindex-dataframe-remove-maximum
