Having a lot of issues with time series objects in R

Posted by 久未见 on 2019-12-12 03:26:11

Question


I am having an extraordinarily difficult time dealing with -any- time series objects of some budget data.

The original data is 14,460 rows of payments on ~1800 contracts, where each row has a DD/MM/YYYY and Amount feature. There are 5296 days between 1/1/2000 and 12/31/2014, but only 3133 of these days actually had payments. The days are therefore irregularly spaced, with more than one contract payment showing up on some days, and zero payments on others.

The main issue I'm having is the brutal stubbornness these time series objects exhibit when fed daily data that occurs at irregular intervals. I've even merged the payments onto a continuous date vector and still have the same issue, namely with frequency, periodicity, or order.by.

# Continuous daily calendar; keep every calendar day and set days without payments to 0
CTS_date_V <- data.frame(Date = seq(as.Date("2000/07/01"), as.Date("2014/12/31"), "days"))
exp_d <- merge(exp, CTS_date_V, by = "Date", all.y = TRUE)
exp_d$Amount[is.na(exp_d$Amount)] <- 0

head(exp_d[,c("Amount","Date")],20)
      Amount       Date
1        0.0 2000-07-01
2        0.0 2000-07-02
3        0.0 2000-07-03
4        0.0 2000-07-04
5   269909.4 2000-07-05
6   130021.9 2000-07-06
7  1454135.3 2000-07-06
8   140065.5 2000-07-07
9        0.0 2000-07-08
10       0.0 2000-07-09
11       0.0 2000-07-10
12  274147.2 2000-07-11
13  106959.2 2000-07-11
14  119208.6 2000-07-12
15       0.0 2000-07-13
16       0.0 2000-07-14
17       0.0 2000-07-15
18  125402.5 2000-07-16
19 1170603.1 2000-07-16
20 1908463.3 2000-07-16
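
The output above still has several rows for some dates (for example 2000-07-06 and 2000-07-16), because the merge keeps every individual payment. A minimal sketch of one way to get exactly one row per calendar day, assuming daily totals are what the analysis needs; the `payments` data frame and its values are made up for illustration and stand in for the real table:

```r
# Synthetic stand-in for the payments table (values are invented)
payments <- data.frame(
  Date   = as.Date(c("2000-07-05", "2000-07-06", "2000-07-06", "2000-07-11")),
  Amount = c(100, 50, 25, 10)
)

# One total per calendar day that had at least one payment
daily <- aggregate(Amount ~ Date, data = payments, FUN = sum)

# Continuous daily calendar; days with no payment get 0
calendar <- data.frame(Date = seq(min(payments$Date), max(payments$Date), by = "day"))
daily_full <- merge(calendar, daily, by = "Date", all.x = TRUE)
daily_full$Amount[is.na(daily_full$Amount)] <- 0

# Each date should now appear exactly once
stopifnot(!anyDuplicated(daily_full$Date))
```

Summing is only one choice of aggregation; counts or means per day would follow the same pattern with a different `FUN`.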

Most of the forecasting packages I am familiar with (as well as those in the questions I have found on SO so far), such as fpp, forecast, timeSeries, tseries, and xts, require a much more orderly Date feature to order.by, or raise some similar concern.

My concern is over the appropriateness of the R package, not the statistical method. For example, I've tried a few different ways of building the time series objects needed by the forecasting packages, including xts and ts, and all of them either have issues with the frequency or the periodicity, or ask for order.by.

UPDATE:

I build my xts object with

exp_xts <- xts(exp_d$Amount, start = min(exp$Date), end = max(exp$Date), order.by=exp_d$Date, colnames = "Amount", frequency = "") 

head(exp_xts,15)
                [,1]
2000-07-01       0.0
2000-07-02       0.0
2000-07-03       0.0
2000-07-04       0.0
2000-07-05  269909.4
2000-07-06  130021.9
2000-07-06 1454135.3
2000-07-07  140065.5
2000-07-08       0.0
2000-07-09       0.0
2000-07-10       0.0
2000-07-11  274147.2
2000-07-11  106959.2
2000-07-12  119208.6
2000-07-13       0.0

without an issue, and that object can be plot.xts()ed, but when I try

fit_xts <- stl(exp_xts, s.window="periodic",robust = T) 

it says

Error in if (frequency > 1 && abs(frequency - round(frequency)) < ts.eps) frequency <- round(frequency) : missing value where TRUE/FALSE needed
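
For context, stl() is documented to take a "ts" object with a frequency greater than one, and the xts object above still carries several rows per date. A minimal sketch of one possible route, not a diagnosis of the error: it assumes the payments have already been collapsed to one total per day, and it assumes a weekly cycle (`frequency = 7` is an illustrative choice that the question does not state). The data are synthetic:

```r
set.seed(1)

# Synthetic daily totals over ten weeks, with lower amounts at weekends
dates      <- seq(as.Date("2000-07-01"), by = "day", length.out = 70)
is_weekend <- as.POSIXlt(dates)$wday %in% c(0, 6)
amount     <- rpois(70, lambda = 5) + ifelse(is_weekend, 0, 10)

# Regular ts object; frequency = 7 encodes the assumed weekly seasonality
daily_ts <- ts(amount, frequency = 7)

# Seasonal-trend decomposition; needs more than two full periods of data
fit <- stl(daily_ts, s.window = "periodic", robust = TRUE)
head(fit$time.series)
```

Note that ts() stores no dates, only a start and a frequency, so the calendar has to be kept alongside the series if it is needed later.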

Answer 1:


I tried using time series objects in R for a Kaggle competition. What I found was that predictions made with the various time series forecasting methods didn't work well for me. What did work was to create a standard R data frame and train a neural network on contextual data such as temperature, day of the week, day of the year, whether the day is a holiday, and so on.

What this could mean for you: since you're doing simple statistical analysis rather than prediction, maybe you don't need the time series functionality at all and could simply use a standard R data frame.

I came 9th in the end, using a standard dataframe, and a neural net, no time series stuff :-)
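
A minimal sketch of the kind of calendar features described above, built in a plain data frame with base R only. The column names are hypothetical, and the holiday and temperature features are left out because they need external data:

```r
# Ten consecutive days starting at the first date in the question's calendar
df <- data.frame(Date = seq(as.Date("2000-07-01"), by = "day", length.out = 10))

lt <- as.POSIXlt(df$Date)
df$day_of_week <- lt$wday            # 0 = Sunday ... 6 = Saturday
df$day_of_year <- lt$yday + 1        # yday is 0-based, so add 1
df$is_weekend  <- lt$wday %in% c(0, 6)

head(df)
```

These columns can then be joined to the daily amounts by `Date` and passed to any ordinary modelling function.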




Answer 2:


I think it might be related to the following problem I encountered recently.

I tried to run the autocorrelation function acf() on a time series. The data had been converted into a time series with the xts/zoo packages. However, acf() ships with base R and needs no extra package, so it is designed for series created by the more 'traditional' function, which in this case is ts(). So this code returned the same error as in your case:

library(xts)
# xts object built from the value column, indexed by the date column
ts <- xts(dane.filtered$CRO, order.by = dane.filtered$Date_xts)
acf(ts, col = "red")

The solution is to create time series using default time series function built into R (this code runs perfectly fine):

ts <- ts(dane.filtered$CRO)
acf(ts, col="red")
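
One caveat with this approach is that ts() keeps only the values and drops the dates, so the rows need to be sorted and evenly spaced before conversion. A minimal sketch under that assumption; `df` is a hypothetical stand-in for `dane.filtered`, filled with synthetic values:

```r
set.seed(42)

# Hypothetical stand-in for dane.filtered (synthetic random-walk values)
df <- data.frame(
  Date = seq(as.Date("2015-01-01"), by = "day", length.out = 60),
  CRO  = cumsum(rnorm(60))
)

df <- df[order(df$Date), ]            # ts() ignores dates, so sort first
stopifnot(all(diff(df$Date) == 1))    # confirm no gaps and no repeated days

cro_ts  <- ts(df$CRO)                 # plain ts; frequency defaults to 1
acf_res <- acf(cro_ts, plot = FALSE)  # plot = FALSE returns the values only
acf_res$acf[1:5]
```

If the check on `diff(df$Date)` fails, the series has missing or duplicated days and should be regularised before calling ts().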

Hope it helps.



Source: https://stackoverflow.com/questions/28618719/having-a-lot-of-issues-with-time-series-objects-in-r
