How can I subtract timedeltas listed in same column?

别等时光非礼了梦想. Submitted on 2021-01-29 08:14:05

Question


I have this data:

                                          count  MySum  MyCount
User Name time_diff           Logon Time                       
192309    -134 days +18:08:00 1               1     34       34
          -129 days +11:05:00 1               1     34       34
          -124 days +15:00:00 1               1     34       34
          -124 days +11:04:00 1               1     34       34
          -124 days +01:26:00 1               1     34       34
...                                         ...    ...      ...
193143    -116 days +21:53:00 1               1      1        1
164883    -119 days +15:32:00 1               1      1        1
200580    -1 days +19:39:00   1               1      1        1
183396    -102 days +01:50:00 1               1      1        1
184806    -6 days +06:15:00   1               1      1        1

Derived from this:

new_col_dipl = (counting_dipl_new.to_frame('count').join((counting_dipl_new.reset_index().groupby('User Name').agg(MySum=('Logon Time', 'sum'), MyCount=('Logon Time', 'count'))), on='User Name'))

For each User Name that has more than one time_diff observation, I want to calculate the differences between consecutive timedeltas by subtracting the values in the time_diff column. The result should go into a new column called time_diffs_per_user, placed to the right of the MyCount column.

This is what I would love to get (the new column, 'time_diffs_per_user', is added by hand as an example, and the values in it are only approximate):

                                          count  MySum  MyCount   time_diffs_per_user
User Name time_diff           Logon Time                       
192309    -134 days +18:08:00 1               1     34       34   
          -129 days +11:05:00 1               1     34       34   5 days
          -124 days +15:00:00 1               1     34       34   5 days
          -124 days +11:04:00 1               1     34       34   4 hours
          -124 days +01:26:00 1               1     34       34   ....
...                                         ...    ...      ...
193143    -116 days +21:53:00 1               1      1        1
164883    -119 days +15:32:00 1               1      1        1
200580    -1 days +19:39:00   1               1      1        1
183396    -102 days +01:50:00 1               1      1        1
184806    -6 days +06:15:00   1               1      1        1
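One way the desired column might be produced is sketched below. It assumes that time_diff is an index level with a timedelta dtype, as the printed frame suggests. The miniature frame is made up for illustration and only mirrors the names in the question. Note that the real differences will not match the rounded values written by hand above, and that a later value smaller than the previous one gives a negative difference.

```python
import pandas as pd

# Hypothetical miniature of the frame above: time_diff lives in the index.
index = pd.MultiIndex.from_tuples(
    [
        (192309, pd.Timedelta(days=-134, hours=18, minutes=8), 1),
        (192309, pd.Timedelta(days=-129, hours=11, minutes=5), 1),
        (192309, pd.Timedelta(days=-124, hours=15, minutes=0), 1),
        (192309, pd.Timedelta(days=-124, hours=11, minutes=4), 1),
        (193143, pd.Timedelta(days=-116, hours=21, minutes=53), 1),
    ],
    names=["User Name", "time_diff", "Logon Time"],
)
new_col_dipl = pd.DataFrame(
    {"count": 1, "MySum": [4, 4, 4, 4, 1], "MyCount": [4, 4, 4, 4, 1]},
    index=index,
)

# Move the index levels into columns so time_diff can be grouped and diffed.
flat = new_col_dipl.reset_index()

# Row minus previous row within each user; the first row per user is NaT.
flat["time_diffs_per_user"] = flat.groupby("User Name")["time_diff"].diff()

# Restore the original index; the new column ends up right of MyCount.
result = flat.set_index(["User Name", "time_diff", "Logon Time"])
print(result)
```

Users with a single observation simply get NaT in the new column, so no separate filter on MyCount is needed.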

Tried this:

dipl_count['time_diffs_per_user'] = dipl_count.groupby('User Name')['day'].apply(lambda x: x.shift(-1) - x) / np.timedelta64(1, 'h')

print("\nnew_col_dipl['time_diff_new']")
print(dipl_count['time_diff_new'])

But I got this error:

numpy.core._exceptions.UFuncTypeError: ufunc 'true_divide' cannot use operands with types dtype('float64') and dtype('<m8[ns]')
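The message suggests that the left-hand side of the division was float64 rather than a timedelta. Presumably the 'day' column holds plain numbers, but that is a guess, since the column is not shown. Dividing by a one-hour timedelta only converts units when the left operand is itself a timedelta. A minimal sketch under that assumption, with a hypothetical flat frame and a hypothetical result column name:

```python
import pandas as pd

# Hypothetical flat frame; the values are made up for illustration.
dipl_count = pd.DataFrame(
    {
        "User Name": [192309, 192309, 192309, 193143],
        "time_diff": pd.to_timedelta([-240, -168, -156, -72], unit="h"),
    }
)

# The unit conversion below needs a timedelta64 column (dtype kind "m").
assert dipl_count["time_diff"].dtype.kind == "m"

# Difference to the previous row within each user, still as timedeltas.
gaps = dipl_count.groupby("User Name")["time_diff"].diff()

# timedelta / timedelta gives a plain float, here the gap in hours.
dipl_count["time_diffs_per_user_h"] = gaps / pd.Timedelta(hours=1)
print(dipl_count)
```

The attempt above uses `x.shift(-1) - x`, which is next row minus current row. `diff()` is current row minus previous row, which matches where the desired output places its values.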

Additional question (basically what I think this is all about): how can I do a time series analysis for each User Name, primarily using the existing 'time_diff' column?
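As a possible starting point for a per-user analysis, the gaps between consecutive time_diff values could be summarised for each User Name. This is only a sketch: the frame, the `gap_hours` column, and the chosen statistics are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical events: one row per logon, with time_diff as a timedelta.
events = pd.DataFrame(
    {
        "User Name": [192309, 192309, 192309, 200580, 200580],
        "time_diff": pd.to_timedelta([-240, -168, -156, -48, -24], unit="h"),
    }
)

# Order each user's observations in time before differencing.
events = events.sort_values(["User Name", "time_diff"])

# Gap to the previous observation of the same user, expressed in hours.
gaps = events.groupby("User Name")["time_diff"].diff()
events["gap_hours"] = gaps / pd.Timedelta(hours=1)

# Per-user summary; "count" ignores the NaN in each user's first row.
summary = events.groupby("User Name")["gap_hours"].agg(["count", "mean", "max"])
print(summary)
```

Working in float hours keeps the aggregation on ordinary numeric data. The same pattern would accept other statistics, such as the median or standard deviation.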

Thank you for your kind support. Best regards, Hubsandspokes

Source: https://stackoverflow.com/questions/65240705/how-can-i-subtract-timedeltas-listed-in-same-column
