How to get minimum of each group for each day based on hour criteria

北海茫月 2020-12-22 01:29

Here are two dataframes you can use for testing:

df = pd.DataFrame({
    'subject_id':[1,1,1,1,1,1,1,1,1,1,1],
    'time_1' :['2173-04-03 12:35:00','


        
4 answers
  •  清酒与你
    2020-12-22 02:28

    Try this.

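    from datetime import timedelta

    import pandas as pd

    def f(x):
        # First minus last timestamp of a run, in whole minutes.
        # The result is 0 when the run starts and ends at the same timestamp.
        return (x.iloc[0] - x.iloc[-1]) // timedelta(minutes=1)

    df1['time_1'] = pd.to_datetime(df1['time_1'])
    # Label consecutive runs of equal val: the counter increments whenever val changes.
    df1['flag'] = df1['val'].diff().ne(0).cumsum()
    df1['t_d'] = df1.groupby('flag')['time_1'].transform(f)
    df1['date'] = df1['time_1'].dt.date
    # True for rows that belong to a run with a nonzero time span.
    mask = df1['t_d'].ne(0)
    # First row of each (run, date) pair among the runs with a nonzero span.
    dfa = df1[mask].groupby(['flag', 'date']).first().reset_index()
    # First row per date among the remaining zero-span runs.
    dfb = df1[~mask].groupby('date').first().reset_index().dropna(how='any')
    # Combine both results and keep a single row per date.
    df_f = dfa.merge(dfb, how='outer')
    df_f = df_f.drop_duplicates(subset='date', keep='first')
    df_f = df_f.drop(['flag', 'date', 't_d'], axis=1)
    df_f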
    

    Output.

       subject_id               time_1  val
    0           1  2173-04-03 12:35:00    5
    1           1  2173-04-04 11:30:00    5
    2           1  2173-04-05 16:00:00    5
    5           1  2173-04-06 04:00:00    3
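
In case the `flag` step is unclear, here is a minimal sketch of how `diff().ne(0).cumsum()` labels consecutive runs of equal `val`. The frame `demo` and the column `run_minutes` are made up for illustration and are not the asker's data. Unlike `f` above, this sketch measures each run as last minus first timestamp and uses the built-in `max`/`min` transforms, which is an assumption about what is convenient, not a change to the answer.

```python
import pandas as pd

# Hypothetical sample data, not the frame from the question.
demo = pd.DataFrame({
    "time_1": pd.to_datetime([
        "2173-04-03 12:35:00",
        "2173-04-03 13:35:00",
        "2173-04-03 14:35:00",
        "2173-04-04 09:00:00",
        "2173-04-04 10:00:00",
    ]),
    "val": [5, 5, 3, 3, 5],
})

# diff() is NaN on the first row and nonzero wherever val changes,
# so ne(0) marks the start of every run and cumsum() numbers the runs.
demo["flag"] = demo["val"].diff().ne(0).cumsum()

# Time span of each run in minutes (last timestamp minus first).
runs = demo.groupby("flag")["time_1"]
span = runs.transform("max") - runs.transform("min")
demo["run_minutes"] = span.dt.total_seconds() // 60

print(demo)
```

Rows 0 and 1 share one label, rows 2 and 3 the next, and the last row forms a run of its own with a span of 0 minutes, which is the case the `mask` in the answer separates out.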
    
