pandas - get most recent value of a particular column indexed by another column (get maximum value of a particular column indexed by another column)

深忆病人 2020-12-01 11:05

I have the following dataframe:

   obj_id   data_date   value
0  4        2011-11-01  59500    
1  2        2011-10-01  35200 
2  4        2010-07-31  24860          


        
6 answers
  •  孤街浪徒
    2020-12-01 11:25

    I believe I have found a more appropriate solution, based on the ones in this thread. Mine uses the apply function on the grouped dataframe instead of aggregate. It returns the most recent DATE for each CARD_NO, indexed by CARD_NO.

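    import pandas as pd

    df = pd.DataFrame({
        'CARD_NO': ['000', '001', '002', '002', '001', '111'],
        'DATE': ['2006-12-31 20:11:39', '2006-12-27 20:11:53', '2006-12-28 20:12:11',
                 '2006-12-28 20:12:13', '2008-12-27 20:11:53', '2006-12-30 20:11:39']})

    print(df)
    # For each CARD_NO, take the DATE at the position of the group's maximum.
    # The dates are ISO-formatted strings, so string order matches date order.
    print(df.groupby('CARD_NO').apply(lambda g: g['DATE'].values[g['DATE'].values.argmax()]))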
    

    Original

      CARD_NO                 DATE
    0     000  2006-12-31 20:11:39
    1     001  2006-12-27 20:11:53
    2     002  2006-12-28 20:12:11
    3     002  2006-12-28 20:12:13
    4     001  2008-12-27 20:11:53
    5     111  2006-12-30 20:11:39
    

    Returned result (a Series indexed by CARD_NO):

    CARD_NO
    000        2006-12-31 20:11:39
    001        2008-12-27 20:11:53
    002        2006-12-28 20:12:13
    111        2006-12-30 20:11:39
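
If the whole row is wanted rather than just the latest date, one option is to combine idxmax with loc. This is only a minimal sketch against the question's columns: it uses just the three rows visible in the question, and converting data_date with pd.to_datetime is my assumption, since the question does not show the column's dtype.

```python
import pandas as pd

# Only the three rows visible in the question; the rest of the frame is not shown.
df = pd.DataFrame({
    'obj_id': [4, 2, 4],
    'data_date': pd.to_datetime(['2011-11-01', '2011-10-01', '2010-07-31']),
    'value': [59500, 35200, 24860],
})

# For each obj_id, idxmax returns the row label of the latest data_date.
latest_idx = df.groupby('obj_id')['data_date'].idxmax()

# Selecting those labels with .loc keeps every column of the chosen rows.
latest = df.loc[latest_idx].set_index('obj_id')
print(latest)
```

Because the selection goes through row labels, the value column comes along with the matching date instead of being aggregated separately.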
    
