idxmax() doesn't work on SeriesGroupBy that contains NaN

借酒劲吻你 2021-01-27 04:41

Here is my code

from pandas import DataFrame, Series
import pandas as pd
import numpy as np
income = DataFrame({'name': ['Adam', 'Bill', 'Chris', 'Dave


        
3 Answers
  •  没有蜡笔的小新
    2021-01-27 05:21

    Since groupby preserves the order of rows within each group, you can sort income by the income column before calling groupby, then pick the first row of each group with head:

    # sort() and .ix were removed from pandas; sort_values() and .loc replace them.
    grouped = income.sort_values('income', ascending=False).groupby([ageBin])
    highestIncome = income.loc[grouped.head(1).index]
    # highestIncome is no longer ordered by age.
    # If you want to recover this, sort it again.
    highestIncome = highestIncome.sort_values('age')
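
    A self-contained sketch of the sort-then-head approach may make it easier to try out. The question's DataFrame and its ageBin are truncated above, so the rows, the NaN positions, and the age bins here are hypothetical stand-ins, not the original data:

```python
import numpy as np
import pandas as pd

# Hypothetical data: the question's real `income` frame is truncated.
income = pd.DataFrame({
    "name": ["Adam", "Bill", "Chris", "Dave", "Edison", "Frank"],
    "age": [22, 24, 31, 45, 51, 55],
    "income": [1000.0, np.nan, 1200.0, np.nan, 7000.0, 4000.0],
})

# Assumed binning: 20-year-wide age bins stored as a column.
income["age_bin"] = pd.cut(income["age"], bins=[0, 20, 40, 60])

# sort_values puts NaN last by default, so head(1) picks the largest
# non-NaN income in every bin that has at least one valid value.
grouped = income.sort_values("income", ascending=False).groupby(
    "age_bin", observed=True
)
highest_income = income.loc[grouped.head(1).index].sort_values("age")
print(highest_income[["name", "age", "income"]])
```

    Note that a bin whose incomes are all NaN would still contribute a row here (with a NaN income), so such rows may need to be filtered out afterwards.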
    

    By the way, beware that the reference manual does not mention that groupby preserves this order. I think the cleanest solution would be to fix pandas's idxmax so that it works. To me, it is a little strange that idxmax does not work while max does.
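
    Until idxmax behaves as expected on groups containing NaN, one possible workaround (not part of the original answer) is to drop the NaN rows first and call idxmax on what remains. The sketch below again uses hypothetical data and an assumed age_bin column, since the question's own frame is truncated:

```python
import numpy as np
import pandas as pd

# Hypothetical data: the question's real `income` frame is truncated.
income = pd.DataFrame({
    "name": ["Adam", "Bill", "Chris", "Dave", "Edison", "Frank"],
    "age": [22, 24, 31, 45, 51, 55],
    "income": [1000.0, np.nan, 1200.0, np.nan, 7000.0, 4000.0],
})
income["age_bin"] = pd.cut(income["age"], bins=[0, 20, 40, 60])  # assumed bins

# Remove rows with a missing income so every group holds only valid values.
valid = income.dropna(subset=["income"])

# idxmax returns, per bin, the index label of the row with the largest income.
# observed=True skips bins that have no rows at all.
idx = valid.groupby("age_bin", observed=True)["income"].idxmax()
highest_income = income.loc[idx]
print(highest_income[["name", "age", "income"]])
```

    Bins that contain only NaN incomes simply disappear from the result with this approach, which may or may not be what you want.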
