Extract row with maximum value in a group pandas dataframe

不知归路 2020-12-04 16:05

A similar question is asked here: Python : Getting the Row which has the max value in groups using groupby

However, I just need one record per group even if there ar

3 Answers
  • 2020-12-04 16:19

    You can use first

    In [14]: df.groupby('Mt').first()
    Out[14]: 
       Sp  Value  count
    Mt                 
    s1  a      1      3
    s2  c      3      5
    s3  f      6      6
    

    Update

    Set as_index=False to achieve your goal

    In [28]: df.groupby('Mt', as_index=False).first()
    Out[28]: 
       Mt Sp  Value  count
    0  s1  a      1      3
    1  s2  c      3      5
    2  s3  f      6      6 
    

    Update Again

    Sorry, I misunderstood the question. If you want the row with the maximum count in each group, sort by count first:

    In [196]: df.sort_values('count', ascending=False).groupby('Mt', as_index=False).first()
    Out[196]: 
       Mt Sp  Value  count
    0  s1  a      1      3
    1  s2  e      5     10
    2  s3  f      6      6
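
A minimal, self-contained sketch of the sort-then-first approach on current pandas, where `DataFrame.sort` is no longer available and `sort_values` is used instead. The sample data is hypothetical, reconstructed only to be consistent with the outputs shown in this thread.

```python
import pandas as pd

# Hypothetical sample data (an assumption, not taken from the question itself).
df = pd.DataFrame({
    "Sp": ["a", "b", "c", "d", "e", "f"],
    "Mt": ["s1", "s1", "s2", "s2", "s2", "s3"],
    "Value": [1, 2, 3, 4, 5, 6],
    "count": [3, 2, 5, 10, 10, 6],
})

# Sort so the largest count comes first, then keep one row per group.
result = (
    df.sort_values("count", ascending=False)
      .groupby("Mt", as_index=False)
      .first()
)
print(result)
```

Rows `d` and `e` tie for the maximum in group `s2`; which of the two is kept depends on how the sort orders equal values, so do not rely on a particular winner unless you request a stable sort.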
    
  • 2020-12-04 16:28

    Playing off of Roman Pekar's answer, I found that the following code works:

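    from math import isnan
    # idxmax() gives NaN for a group whose counts are all NaN; skip those groups.
    # iloc is used with the returned labels, so this assumes the default integer index.
    df.iloc[[int(x) for x in df.groupby(by=df.Mt).apply(lambda g: g['count'].idxmax()).values if not isnan(x)]]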
    

    Note the isnan condition: my application had some NaN entries in the column being maximized over.
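
One possible alternative to filtering the `idxmax()` results afterwards is to drop the NaN rows before grouping, so that a group with no valid count simply disappears. This is a sketch under that assumption, with a hypothetical sample frame; it is not the code from the answer above.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with missing counts; group "s3" has no valid count at all.
df = pd.DataFrame({
    "Mt": ["s1", "s1", "s2", "s2", "s3"],
    "count": [3, np.nan, np.nan, 10, np.nan],
})

# Remove rows with a missing count first, then find the label of the max per group.
idx = df.dropna(subset=["count"]).groupby("Mt")["count"].idxmax()

# idxmax() returns index labels, so select with .loc.
result = df.loc[idx]
print(result)
```

Groups whose counts are all NaN are absent from the result rather than producing a NaN label.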

  • 2020-12-04 16:32

    To get the first occurrence of the maximum count in each group, you can use the pandas idxmax() method:

    >>> df.iloc[df.groupby(['Mt']).apply(lambda x: x['count'].idxmax())]
       Mt Sp  Value  count
    0  s1  a      1      3
    3  s2  d      4     10
    5  s3  f      6      6
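
Because `idxmax()` returns index labels rather than positions, `iloc` gives the intended rows only when the frame has the default 0..n-1 index. A hedged sketch of the same idea with `.loc`, using a hypothetical frame whose index deliberately does not start at zero:

```python
import pandas as pd

# Hypothetical sample data with a non-default index (an assumption for illustration).
df = pd.DataFrame(
    {
        "Mt": ["s1", "s1", "s2", "s2", "s2", "s3"],
        "Sp": ["a", "b", "c", "d", "e", "f"],
        "Value": [1, 2, 3, 4, 5, 6],
        "count": [3, 2, 5, 10, 10, 6],
    },
    index=[10, 11, 12, 13, 14, 15],
)

# Label of the first row holding the maximum count in each group.
idx = df.groupby("Mt")["count"].idxmax()

# Select by label; .iloc would look for positions 10, 13, 15 and fail here.
result = df.loc[idx]
print(result)
```

In group `s2`, rows `d` and `e` share the maximum, and `idxmax()` picks the first one it encounters.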
    