Taking Median of two datetime values or columns

Posted by 谁都会走 on 2019-12-13 18:13:12

Question


For the data below, I want to take the middle value (the midpoint in time) of the first two timestamps in each row and then subtract the third timestamp.

What would be the best way to take the median value, or middle datetime, of two timestamps?

The expected output is the difference between the two timestamps, in minutes.

It is the median or mean of the first two timestamps minus the third timestamp.

For example, it is the middle value of 2018-12-21 23:31:24.615 and 2018-12-21 23:31:26.659.

Once I have that value, I want to subtract the third timestamp, 2018-12-21 23:31:27.975. The output would be a value in minutes.


Answer 1:


If you just want the middle value of the datetime column, you can do this:

df['linked_trip_pickup_departed_time'].astype('datetime64[ns]').quantile(.5)
df['pickup_departed_time_utc'].astype('datetime64[ns]').quantile(.5)

This will give you the median for each datetime column. Now, you can subtract it.
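Note that this produces one median per column, not one per row. As a minimal sketch of that behaviour, the snippet below puts the question's first two timestamps into a single column purely for illustration; the names `col` and `col_median` are hypothetical and not from the original post.

```python
import pandas as pd

# Illustrative column holding the two sample timestamps from the question.
col = pd.Series(pd.to_datetime(['2018-12-21 23:31:24.615',
                                '2018-12-21 23:31:26.659']))

# quantile(.5) reduces the whole column to a single Timestamp.
col_median = col.quantile(.5)
print(col_median)
```

For a per-row midpoint of two columns, a row-wise calculation such as the one in the next answer is the better fit.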




Answer 2:


Assuming the df looks like:

import pandas as pd

df = pd.DataFrame(data={'time1':['2018-12-21 23:31:24.615','2018-12-22 01:33:26.015'],'time2':['2018-12-21 23:31:26.659','2018-12-22 01:33:32.865'],'time3':['2018-12-21 23:31:27.975','2018-12-22 01:59:05.136']})

    time1                   time2                   time3
0   2018-12-21 23:31:24.615 2018-12-21 23:31:26.659 2018-12-21 23:31:27.975
1   2018-12-22 01:33:26.015 2018-12-22 01:33:32.865 2018-12-22 01:59:05.136

Convert the columns to datetime with pd.to_datetime:

df[['time1','time2','time3']] = df[['time1','time2','time3']].apply(pd.to_datetime,errors='coerce')

Create a column holding the average of the first two columns:

my_list = []
for i in df.index:
    # .value is the timestamp in integer nanoseconds; // 2 keeps the midpoint an exact integer
    my_list.append(pd.to_datetime((df['time1'][i].value + df['time2'][i].value) // 2))
df['avg'] = my_list

Or, more simply:

df['avg'] = [pd.to_datetime((df['time1'][i].value + df['time2'][i].value) // 2) for i in df.index]
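The same midpoint can also be computed without a Python-level loop. The following is a minimal sketch, not part of the original answer, that reuses the sample `time1` and `time2` values and adds half of the gap between them to the earlier timestamp.

```python
import pandas as pd

df = pd.DataFrame({
    'time1': ['2018-12-21 23:31:24.615', '2018-12-22 01:33:26.015'],
    'time2': ['2018-12-21 23:31:26.659', '2018-12-22 01:33:32.865'],
})
df = df.apply(pd.to_datetime, errors='coerce')

# Midpoint = start + half of the gap; operates on whole columns at once.
df['avg'] = df['time1'] + (df['time2'] - df['time1']) / 2
print(df['avg'])
```

Timestamps cannot be added to each other directly, which is why the sketch works with the difference (a timedelta) rather than with `(time1 + time2) / 2`.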

Find the difference between time3 and avg:

# Whole minutes as floats; .astype('timedelta64[m]') only works on older pandas versions.
(df.time3 - df.avg).dt.total_seconds() // 60

Output:

0     0.0
1    25.0
dtype: float64
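The result above is truncated to whole minutes. If fractional minutes are wanted instead, one option is to divide the timedelta by a one-minute `pd.Timedelta`. This is a sketch under that assumption; the `time3` and `avg` Series below are rebuilt from the sample values so the snippet runs on its own.

```python
import pandas as pd

time3 = pd.to_datetime(pd.Series(['2018-12-21 23:31:27.975',
                                  '2018-12-22 01:59:05.136']))
avg = pd.to_datetime(pd.Series(['2018-12-21 23:31:25.637',
                                '2018-12-22 01:33:29.440']))

# Dividing a timedelta Series by a Timedelta gives plain floats (here: minutes).
minutes = (time3 - avg) / pd.Timedelta(minutes=1)
print(minutes)
```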

P.S.: Replace time1, time2 and time3 with the column names in your DataFrame.



Source: https://stackoverflow.com/questions/53940340/taking-median-of-two-datetime-values-or-columns
