Datetime objects with pandas mean function

Submitted by 倾然丶 夕夏残阳落幕 on 2019-11-29 11:24:09
Alex

You can use datetime.timedelta

import datetime
import functools
import operator

import pandas as pd

d = {'one': pd.Series([1, 2, 3], index=['a', 'b', 'c']),
     'two': pd.Series([datetime.datetime(2014, 7, 9),
                       datetime.datetime(2014, 7, 10),
                       datetime.datetime(2014, 7, 11)],
                      index=['a', 'b', 'c'])}
df = pd.DataFrame(d)

def avg_datetime(series):
    # Datetimes cannot be summed, so average the offsets from the earliest value
    dt_min = series.min()
    deltas = [x - dt_min for x in series]
    # Sum the timedeltas, divide by the count, then shift back by the minimum
    return dt_min + functools.reduce(operator.add, deltas) / len(deltas)

print(avg_datetime(df['two']))

To simplify Alex's answer (I would have added this as a comment but I don't have sufficient reputation):

import datetime
import pandas as pd

d={'one': pd.Series([1, 2, 3], index=['a', 'b', 'c']), 
   'two': pd.Series([datetime.datetime(2014, 7, 9), 
           datetime.datetime(2014, 7, 10), 
           datetime.datetime(2014, 7, 11) ], 
           index=['a', 'b', 'c'])}
df = pd.DataFrame(d)

Which looks like:

   one   two
a   1   2014-07-09
b   2   2014-07-10
c   3   2014-07-11

Then calculate the mean of column "two" by:

(df.two - df.two.min()).mean() + df.two.min()

So, subtract the min of the timeseries, calculate the mean (or median) of the resulting timedeltas, and add back the min.
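The same idea can be wrapped in a small helper. This is only a sketch: the function name datetime_center and its how parameter are made up for illustration and are not part of pandas.

```python
import pandas as pd

def datetime_center(series, how="mean"):
    """Mean or median of a datetime Series, computed via timedeltas.

    Hypothetical helper for illustration; not a pandas API.
    """
    origin = series.min()        # reference point: the earliest timestamp
    offsets = series - origin    # timedelta Series, which supports mean/median
    center = offsets.median() if how == "median" else offsets.mean()
    return origin + center       # shift the averaged offset back to a timestamp

dates = pd.Series(pd.to_datetime(["2014-07-09", "2014-07-10", "2014-07-11"]))
mean_date = datetime_center(dates)
median_date = datetime_center(dates, how="median")
print(mean_date, median_date)
```

Any fixed timestamp would work as the reference point; the minimum is just convenient because every offset is then non-negative.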

This issue is partly resolved as of pandas 0.25. However, mean can currently only be applied to a datetime Series, not to a datetime column within a DataFrame.

In [1]: import pandas as pd

In [2]: s = pd.Series([pd.Timestamp(2014, 7, 9), 
   ...:            pd.Timestamp(2014, 7, 10), 
   ...:            pd.Timestamp(2014, 7, 11)])

In [3]: s.mean()
Out[3]: Timestamp('2014-07-10 00:00:00')

Applying .mean() to a DataFrame containing a datetime series returns the same result as shown in the original question.

In [4]: df = pd.DataFrame({'numeric':[1,2,3],
   ...:               'datetime':s})

In [5]: df.mean()
Out[5]: 
numeric    2.0
dtype: float64
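One possible workaround, since Series.mean() does handle datetimes, is to take the mean column by column instead of calling it on the whole DataFrame. This is a minimal sketch; the name column_means is introduced here for illustration, and the result is a plain dict rather than a Series.

```python
import pandas as pd

s = pd.Series(pd.to_datetime(["2014-07-09", "2014-07-10", "2014-07-11"]))
df = pd.DataFrame({"numeric": [1, 2, 3], "datetime": s})

# Call Series.mean() on each column separately and collect the results.
# column_means is an illustrative name, not a pandas API.
column_means = {name: column.mean() for name, column in df.items()}

print(column_means["numeric"])
print(column_means["datetime"])
```

A dict is used because the per-column results have different types (a float and a Timestamp).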