Pandas: Difference between largest and smallest value within group

心在旅途 2020-11-30 02:45

Given a data frame that looks like this

GROUP VALUE
  1     5
  2     2
  1     10
  2     20
  1     7

I would like to compute the difference between the largest and smallest value within each group.

3 Answers
  •  情歌与酒
    2020-11-30 02:48

    Using @unutbu's df.

    Per the timings, unutbu's solution performs best on large data sets.

    import pandas as pd
    import numpy as np
    
    df = pd.DataFrame({'GROUP': [1, 2, 1, 2, 1], 'VALUE': [5, 2, 10, 20, 7]})
    
    # np.ptp ("peak to peak") returns max - min of VALUE within each group
    df.groupby('GROUP')['VALUE'].agg(np.ptp)
    
    GROUP
    1     5
    2    18
    Name: VALUE, dtype: int64
    

    Per the docs, np.ptp returns the range (maximum minus minimum) of an array.
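As a quick check of what "range" means here, the following minimal sketch compares np.ptp on a plain array with an explicit max-minus-min per group. The names `values`, `ptp_direct`, and `range_by_group` are illustrative and not from the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'GROUP': [1, 2, 1, 2, 1], 'VALUE': [5, 2, 10, 20, 7]})

# np.ptp on a plain array: largest value minus smallest value
values = np.array([5, 10, 7])          # the VALUE entries of group 1
ptp_direct = np.ptp(values)

# The same quantity per group, spelled out with built-in aggregations
grouped = df.groupby('GROUP')['VALUE']
range_by_group = grouped.max() - grouped.min()

print(ptp_direct)
print(range_by_group)
```

Both forms should agree with the output shown above for each group.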


    Timing

    Small df

    Large df
    df = pd.DataFrame(dict(GROUP=np.arange(1000000) % 100, VALUE=np.random.rand(1000000)))

    Large df, many groups
    df = pd.DataFrame(dict(GROUP=np.arange(1000000) % 10000, VALUE=np.random.rand(1000000)))
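To reproduce a comparison like this yourself, a rough sketch with `timeit` might look as follows. The helper names (`make_df`, `range_ptp`, `range_max_min`, `time_both`) are hypothetical, the max-minus-min variant is only an assumed alternative (the other answers are not shown here), and the row count is reduced so the sketch runs quickly. No timing numbers are claimed; results depend on the machine and library versions.

```python
import timeit

import numpy as np
import pandas as pd


def make_df(n_rows, n_groups, seed=0):
    """Build a frame shaped like the ones above (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    return pd.DataFrame({'GROUP': np.arange(n_rows) % n_groups,
                         'VALUE': rng.random(n_rows)})


def range_ptp(df):
    # The approach from this answer: np.ptp applied to each group
    return df.groupby('GROUP')['VALUE'].agg(np.ptp)


def range_max_min(df):
    # Assumed alternative: built-in max and min aggregations, then subtract
    grouped = df.groupby('GROUP')['VALUE']
    return grouped.max() - grouped.min()


def time_both(df, repeat=3):
    # Best-of-`repeat` wall-clock seconds for a single call of each approach
    return {
        'agg(np.ptp)': min(timeit.repeat(lambda: range_ptp(df), number=1, repeat=repeat)),
        'max - min': min(timeit.repeat(lambda: range_max_min(df), number=1, repeat=repeat)),
    }


timings = time_both(make_df(100_000, 100))
print(timings)
```

Changing the arguments of `make_df` (for example to 1000000 rows with 100 or 10000 groups) mirrors the large-frame cases listed above.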
