Drop duplicates keeping the row with the highest value in another column

import pandas as pd
a = [['John', 'Mary', 'John'], [10, 22, 50]]
# a holds one list per column; transpose it into (Name, Count) rows
df1 = pd.DataFrame(list(zip(*a)), columns=['Name', 'Count'])

Given a data frame like this I want to compa

1 Answer

2020-12-30 12:41

Either sort by Count with sort_values, then call drop_duplicates on Name and keep the last (highest) row for each name:

df1.sort_values('Count').drop_duplicates('Name', keep='last')

   Name  Count
1  Mary     22
2  John     50
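
For reference, here is a self-contained sketch of the sort-then-deduplicate approach. The sample data is rebuilt as a dict so the block runs on its own, and `result` is just an illustrative name.

```python
import pandas as pd

# Sample data rebuilt as a dict (one row per name/count pair).
df1 = pd.DataFrame({"Name": ["John", "Mary", "John"], "Count": [10, 22, 50]})

# Sort ascending by Count so the largest Count for each Name comes last,
# then keep only that last occurrence of each Name.
result = df1.sort_values("Count").drop_duplicates("Name", keep="last")
print(result)
```

The original index labels survive; chain `.reset_index(drop=True)` if you want a fresh 0..n-1 index.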

Or, as miradulo said, group by Name and take the max of Count:

df1.groupby('Name')['Count'].max().reset_index()

   Name  Count
0  John     50
1  Mary     22
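
Note that the groupby/max form returns only Name and Count. If the frame has other columns you want to keep, one possible variant (not part of the answer above) is to look up the row holding each group's maximum with `idxmax`. In this sketch the `City` column and the names `df` and `best_rows` are hypothetical, added only to show that whole rows are preserved.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["John", "Mary", "John"],
    "Count": [10, 22, 50],
    "City": ["Oslo", "Rome", "Lima"],  # hypothetical extra column
})

# For each Name, idxmax gives the index label of the row with the largest Count;
# .loc then pulls those complete rows out of the original frame.
best_rows = df.loc[df.groupby("Name")["Count"].idxmax()]
print(best_rows)
```

This assumes the index labels are unique; with a duplicated index, `.loc` could return more rows than intended.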
