I would like to calculate the mean and standard deviation of a timedelta by bank, from a dataframe with the two columns shown.
You need to convert the timedelta to a numeric value first, e.g. int64 via values. This is the most accurate option, because nanoseconds are the native numeric representation of a timedelta:
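import numpy as np
import pandas as pd
# timedelta -> integer nanoseconds
dropped['new'] = dropped['diff'].values.astype(np.int64)
# per-bank mean and standard deviation, each converted back to timedelta
means = dropped.groupby('bank').mean()
means['new'] = pd.to_timedelta(means['new'])
std = dropped.groupby('bank').std()
std['new'] = pd.to_timedelta(std['new'])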
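For a self-contained illustration of the nanosecond approach, here is a minimal sketch. The sample data is made up, and the explicit cast to timedelta64[ns] is an assumption added so the integers are nanoseconds regardless of how the column was created. It aggregates only the helper column, so other columns cannot interfere:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
dropped = pd.DataFrame({
    "bank": ["A", "A", "B", "B"],
    "diff": pd.to_timedelta(["1 days", "3 days", "2 hours", "4 hours"]),
})

# Force nanosecond resolution, then take the underlying integers.
dropped["new"] = dropped["diff"].astype("timedelta64[ns]").values.astype(np.int64)

# Aggregate only the integer column, then convert both results back.
stats = dropped.groupby("bank")["new"].agg(["mean", "std"])
stats["mean"] = pd.to_timedelta(stats["mean"])
stats["std"] = pd.to_timedelta(stats["std"])
print(stats)
```

Here std is the sample standard deviation (ddof=1), which is the pandas default.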
Another solution is to convert the values to seconds with total_seconds, but that is less accurate, because the result is a float:
dropped['new'] = dropped['diff'].dt.total_seconds()
means = dropped.groupby('bank').mean()
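The seconds-based result is a plain float, so if timedeltas are wanted again it has to be converted back. One way, sketched here with made-up sample data, is pd.to_timedelta with unit='s':

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
dropped = pd.DataFrame({
    "bank": ["A", "A", "B", "B"],
    "diff": pd.to_timedelta(["1 days", "3 days", "2 hours", "4 hours"]),
})

# Durations as float seconds.
dropped["new"] = dropped["diff"].dt.total_seconds()

# Per-bank mean and sample standard deviation, in seconds.
secs = dropped.groupby("bank")["new"].agg(["mean", "std"])

# Interpret the floats as seconds when converting back to timedelta.
mean_td = pd.to_timedelta(secs["mean"], unit="s")
std_td = pd.to_timedelta(secs["std"], unit="s")
print(mean_td)
print(std_td)
```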