Drop duplicates keeping the row with the highest value in another column

庸人自扰 2020-12-30 12:27
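import pandas as pd
a = [['John', 'Mary', 'John'], [10, 22, 50]]
# Transpose the two lists into (Name, Count) rows so they match the two column names
df1 = pd.DataFrame(list(zip(*a)), columns=['Name', 'Count'])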

Given a data frame like this I want to compa

1 Answer
  • 2020-12-30 12:41

    Either sort_values and drop_duplicates, keeping the last (highest) row per Name,

    df1.sort_values('Count').drop_duplicates('Name', keep='last')
    
       Name  Count
    1  Mary     22
    2  John     50
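
The sort-then-drop approach keeps whole rows, which matters once the frame has more than two columns. A minimal self-contained sketch, assuming a hypothetical extra `City` column that is not in the question; the final `sort_index().reset_index(drop=True)` step is optional and only restores the original row order with a fresh index:

```python
import pandas as pd

# Same data as in the question, plus a hypothetical 'City' column
df = pd.DataFrame({
    'Name': ['John', 'Mary', 'John'],
    'Count': [10, 22, 50],
    'City': ['Oslo', 'Rome', 'Lima'],  # illustrative only
})

# Sort ascending by Count, then keep the last (largest) row per Name
best = df.sort_values('Count').drop_duplicates('Name', keep='last')

# Optional: restore the original row order and renumber the index
best = best.sort_index().reset_index(drop=True)
print(best)
```

Each surviving row carries the `City` value that belongs to its maximum `Count`.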
    

    Or, like miradulo said, groupby and max.

    df1.groupby('Name')['Count'].max().reset_index()
    
       Name  Count
    0  John     50
    1  Mary     22
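
Note that groupby and max returns only the grouped and aggregated columns, and applying max to several columns would take each column's maximum independently. If the whole row is wanted, one possible variant (not from the answer above) is `idxmax` plus `loc`. A sketch, again assuming a hypothetical `City` column:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Mary', 'John'],
    'Count': [10, 22, 50],
    'City': ['Oslo', 'Rome', 'Lima'],  # hypothetical extra column
})

# idxmax returns, per Name, the index label of the row with the largest Count
top_idx = df.groupby('Name')['Count'].idxmax()

# Select those full rows, so 'City' comes from the same row as the max Count
top_rows = df.loc[top_idx].reset_index(drop=True)
print(top_rows)
```

This relies on the index labels being unique; with a duplicated index, `loc` would return more rows than intended.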
    