Calculate the time difference between two consecutive rows in pandas

梦如初夏 2020-12-10 09:38

I have a pandas DataFrame as follows, and I want the time difference between each pair of consecutive rows:

Dev_id     Time
88345      13:40:31
87556      13:20:33
88955      13:05:00
.....      ........
85678      12:15:28


        
2 Answers
  • 2020-12-10 09:54

    The problem is that pandas needs datetimes or timedeltas for the diff function. So first convert the column with to_timedelta, then call total_seconds and divide by 60 to get the difference in minutes:

    df['Time_diff'] = pd.to_timedelta(df['Time'].astype(str)).diff(-1).dt.total_seconds().div(60)
    # Alternative: parse the column as datetimes instead of timedeltas
    # df['Time_diff'] = pd.to_datetime(df['Time'].astype(str)).diff(-1).dt.total_seconds().div(60)
    print(df)
       Dev_id      Time  Time_diff
    0   88345  13:40:31  19.966667
    1   87556  13:20:33  15.550000
    2   88955  13:05:00  49.533333
    3   85678  12:15:28        NaN
    

    If you want to floor or round to whole minutes:

    # 'min' is the minute frequency alias; the older 'T' alias is deprecated in recent pandas
    df['Time_diff'] = (pd.to_timedelta(df['Time'].astype(str))
                         .diff(-1)
                         .dt.floor('min')
                         .dt.total_seconds()
                         .div(60))
    print(df)
       Dev_id      Time  Time_diff
    0   88345  13:40:31       19.0
    1   87556  13:20:33       15.0
    2   88955  13:05:00       49.0
    3   85678  12:15:28        NaN
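
The answer mentions rounding but only demonstrates floor. A minimal sketch of the rounding variant follows, assuming the four sample rows from the question and that Time holds HH:MM:SS strings (the DataFrame construction here is illustrative, not from the original post):

```python
import pandas as pd

# Sample data rebuilt from the question (assumed to be plain strings)
df = pd.DataFrame({
    "Dev_id": [88345, 87556, 88955, 85678],
    "Time": ["13:40:31", "13:20:33", "13:05:00", "12:15:28"],
})

# diff(-1) subtracts the next row from the current one,
# so the last row has nothing to compare against and becomes NaT
deltas = pd.to_timedelta(df["Time"].astype(str)).diff(-1)

# Round to the nearest whole minute, then express the result in minutes
df["Time_diff"] = deltas.dt.round("min").dt.total_seconds().div(60)
print(df)
```

With this data the rounded differences should come out as 20, 16 and 50 minutes, compared with 19, 15 and 49 when flooring.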
    
  • 2020-12-10 10:01

    You should first convert the df['Time'] column to timedeltas (for example with pd.to_timedelta) and then do the subtraction between consecutive rows.
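
One way this suggestion might look in practice is sketched below. It assumes the sample rows from the question, with Time stored as HH:MM:SS strings, and uses shift(-1) to line each row up with the one below it; the answer itself does not say which subtraction method was intended.

```python
import pandas as pd

# Sample data rebuilt from the question (assumed to be plain strings)
df = pd.DataFrame({
    "Dev_id": [88345, 87556, 88955, 85678],
    "Time": ["13:40:31", "13:20:33", "13:05:00", "12:15:28"],
})

# Step 1: convert the column to timedeltas
times = pd.to_timedelta(df["Time"].astype(str))

# Step 2: subtract the following row from each row;
# shift(-1) moves every value up one position, leaving NaT in the last row
df["Time_diff"] = (times - times.shift(-1)).dt.total_seconds() / 60
print(df)
```

Subtracting a shifted copy is equivalent to diff(-1) here, but it makes the row pairing explicit, which can help if the direction of the difference needs to change.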
