Pandas aggregate count distinct

再見小時候 2020-12-04 07:34

Let's say I have a log of user activity and I want to generate a report of the total duration and the number of unique users per day.

import numpy as np
import pandas as pd

3 Answers
  •  被撕碎了的回忆
    2020-12-04 08:23

    How about either of:

    >>> df
             date  duration user_id
    0  2013-04-01        30    0001
    1  2013-04-01        15    0001
    2  2013-04-01        20    0002
    3  2013-04-02        15    0002
    4  2013-04-02        30    0002
    >>> df.groupby("date").agg({"duration": np.sum, "user_id": pd.Series.nunique})
                duration  user_id
    date                         
    2013-04-01        65        2
    2013-04-02        45        1
    >>> df.groupby("date").agg({"duration": np.sum, "user_id": lambda x: x.nunique()})
                duration  user_id
    date                         
    2013-04-01        65        2
    2013-04-02        45        1
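
The same report can also be written with named aggregation, which lets you name the output columns and pass the aggregations as strings instead of `np.sum` or a lambda. This is only a minimal sketch: it assumes pandas 0.25 or later, rebuilds the sample frame shown above by hand, and the output column names `total_duration` and `unique_users` are made up for illustration.

```python
import pandas as pd

# Sample data copied from the frame printed in the answer above.
df = pd.DataFrame({
    "date": ["2013-04-01", "2013-04-01", "2013-04-01", "2013-04-02", "2013-04-02"],
    "duration": [30, 15, 20, 15, 30],
    "user_id": ["0001", "0001", "0002", "0002", "0002"],
})

# Named aggregation: output_column=(input_column, aggregation).
# The output column names are hypothetical; pick whatever suits the report.
report = df.groupby("date").agg(
    total_duration=("duration", "sum"),
    unique_users=("user_id", "nunique"),
)

print(report)
```

Note that `nunique` ignores missing values by default, so rows with a missing `user_id` would not be counted as a user.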
    
