Question
For the data below, I want to take the middle value (the midpoint in time) of the first two timestamps in each row and then subtract the third timestamp.
What would be the best way to take the median value, or middle datetime, of two timestamps?
The expected output is the difference between the two timestamps, in minutes.
It is the median or mean of the first two timestamps minus the third timestamp.
For example, I want the middle value of 2018-12-21 23:31:24.615 and 2018-12-21 23:31:26.659. Once I have that value, I want to subtract the third timestamp, 2018-12-21 23:31:27.975. The output would be a value in minutes.
Answer 1:
If you just want the middle value of the datetime column, you can do this:
df['linked_trip_pickup_departed_time'].astype('datetime64[ns]').quantile(.5)
df['pickup_departed_time_utc'].astype('datetime64[ns]').quantile(.5)
This will give you the median for each datetime column. Now you can subtract it.
Answer 2:
Assuming the df looks like:
import pandas as pd

df = pd.DataFrame(data={
    'time1': ['2018-12-21 23:31:24.615', '2018-12-22 01:33:26.015'],
    'time2': ['2018-12-21 23:31:26.659', '2018-12-22 01:33:32.865'],
    'time3': ['2018-12-21 23:31:27.975', '2018-12-22 01:59:05.136'],
})
                     time1                    time2                    time3
0  2018-12-21 23:31:24.615  2018-12-21 23:31:26.659  2018-12-21 23:31:27.975
1  2018-12-22 01:33:26.015  2018-12-22 01:33:32.865  2018-12-22 01:59:05.136
Convert the columns with to_datetime:
df[['time1','time2','time3']] = df[['time1','time2','time3']].apply(pd.to_datetime,errors='coerce')
Create a column holding the average of the first two columns:
my_list = []
for i in df.index:
    # .value is the timestamp as integer nanoseconds since the epoch;
    # integer division avoids the precision loss of a float average
    my_list.append(pd.Timestamp((df.loc[i, 'time1'].value + df.loc[i, 'time2'].value) // 2))
df['avg'] = my_list
Or simply:
df['avg'] = [pd.Timestamp((df.loc[i, 'time1'].value + df.loc[i, 'time2'].value) // 2) for i in df.index]
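Find the difference between time3 and avg, in whole minutes:
# astype('timedelta64[m]') is rejected by pandas 2.0 and later, so floor the total seconds instead
(df.time3 - df.avg).dt.total_seconds() // 60
Output: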
0 0.0
1 25.0
dtype: float64
P.S.: You have to replace time1, time2 and time3 with the column names in your DataFrame.
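As an alternative to looping over rows, the midpoint can also be computed with column arithmetic. This is only a sketch under the same assumed sample data and placeholder column names (time1, time2, time3); diff_minutes is a hypothetical column name, and it keeps fractional minutes instead of truncating them:

```python
import pandas as pd

df = pd.DataFrame({
    "time1": ["2018-12-21 23:31:24.615", "2018-12-22 01:33:26.015"],
    "time2": ["2018-12-21 23:31:26.659", "2018-12-22 01:33:32.865"],
    "time3": ["2018-12-21 23:31:27.975", "2018-12-22 01:59:05.136"],
})
cols = ["time1", "time2", "time3"]
df[cols] = df[cols].apply(pd.to_datetime, errors="coerce")

# Midpoint of two timestamps: the first one plus half the gap to the second.
df["avg"] = df["time1"] + (df["time2"] - df["time1"]) / 2

# Difference between the third timestamp and the midpoint, in fractional minutes.
df["diff_minutes"] = (df["time3"] - df["avg"]).dt.total_seconds() / 60
```

Adding half of the timedelta avoids summing raw nanosecond integers, and rows where parsing failed simply stay NaT/NaN. If the sign should follow the question's wording (midpoint minus third timestamp), swap the operands in the last line.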
Source: https://stackoverflow.com/questions/53940340/taking-median-of-two-datetime-values-or-columns