Question
I have irregular time series data stored in a pandas.DataFrame with a DatetimeIndex set. I need the time difference between consecutive entries in the index.
I thought it would be as simple as
data.index.diff()
but got
AttributeError: 'DatetimeIndex' object has no attribute 'diff'
I tried
data.index - data.index.shift(1)
but got
ValueError: Cannot shift with no freq
I do not want to infer or enforce a frequency before doing this operation. There are large gaps in the time series that would be expanded into long runs of NaN. The point is to find these gaps first.
So, what is a clean way to do this seemingly simple operation?
Answer 1:
Index does not implement a diff function yet.
But it is possible to convert the index to a Series first: use Index.to_series if you need to keep the original index, or the Series constructor with no index parameter if you need the default integer index:
import pandas as pd

rng = pd.to_datetime(['2015-01-10','2015-01-12','2015-01-13'])
data = pd.DataFrame({'a': range(3)}, index=rng)
print(data)
            a
2015-01-10  0
2015-01-12  1
2015-01-13  2
# Keep the original DatetimeIndex as the index of the result
a = data.index.to_series().diff()
print(a)
2015-01-10      NaT
2015-01-12   2 days
2015-01-13   1 days
dtype: timedelta64[ns]
# Use the default integer index instead
a = pd.Series(data.index).diff()
print(a)
0      NaT
1   2 days
2   1 days
dtype: timedelta64[ns]
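Since the goal in the question is to find the gaps, the deltas from to_series().diff() can be filtered directly. The following is only a minimal sketch: the one-day threshold and the names deltas, threshold and gaps are assumptions chosen for illustration and are not part of the original answer.

```python
import pandas as pd

rng = pd.to_datetime(['2015-01-10', '2015-01-12', '2015-01-13'])
data = pd.DataFrame({'a': range(3)}, index=rng)

# Time elapsed since the previous row; the first row has no predecessor, so it is NaT.
deltas = data.index.to_series().diff()

# Hypothetical threshold: treat any step longer than one day as a gap.
threshold = pd.Timedelta(days=1)

# Comparisons against NaT evaluate to False, so the first row is dropped automatically.
gaps = deltas[deltas > threshold]
print(gaps)
```

Each entry in gaps is labelled with the timestamp at which the gap ends, and its value is the length of the gap.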
Answer 2:
This question is a bit old, but anyway...
I use numpy.diff(data.index) to get the time deltas, and it works fine.
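A minimal sketch of the numpy.diff approach, reusing the sample data from Answer 1. The names raw_deltas, deltas and steps are assumptions for illustration, and the sketch passes data.index.values rather than the index itself so that NumPy receives a plain datetime64 array. Note that, unlike Series.diff, numpy.diff returns one element fewer than its input and has no leading NaT.

```python
import numpy as np
import pandas as pd

rng = pd.to_datetime(['2015-01-10', '2015-01-12', '2015-01-13'])
data = pd.DataFrame({'a': range(3)}, index=rng)

# np.diff operates on the underlying datetime64 array and returns timedelta64 values.
raw_deltas = np.diff(data.index.values)

# Optional: wrap the result in a TimedeltaIndex for pandas-style display.
deltas = pd.to_timedelta(raw_deltas)

# The result is one element shorter than the index, so align it with data.index[1:]
# to label each delta with the timestamp it leads into.
steps = pd.Series(deltas, index=data.index[1:])
print(steps)
```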
Source: https://stackoverflow.com/questions/49277932/difference-pandas-datetimeindex-without-a-frequency