Hierarchical Time Series

问题

I used the hts package in R to fit an HTS model on train data, used "arima" option to forecast and computed the accuracy on the holdout/test data. Here is my code:

library(hts)
data<-read.csv("C:/TS.csv")
ts_train <- ts(data[,-1],frequency=12, start=c(2000,1))
hts_train <- hts(ts_train, nodes=list(2, c(4, 2)))
data.test<-read.csv("C:/TStest.csv")
ts_test <- ts(data.test[,-1],frequency=12, start=c(2003,1))
hts_test <- hts(ts_test, nodes=list(2, c(4, 2)))
forecast <- forecast(hts_train, h=15, method="bu", fmethod="arima", keep.fitted = TRUE, keep.resid = TRUE)
accuracy<-accuracy.gts(forecast, hts_test)

Now, let's suppose I'm happy with the accuracy on the holdout sample and I'd like to lump the test data back with the train data and re-forecast using the full set.

I tried using this code:

data.full<-read.csv("C:/TS_full.csv")
ts_full <- ts(data.full[,-1],frequency=12, start=c(2000,1))
hts_full <- hts(ts_full, nodes=list(2, c(4, 2)))
forecast.full <- forecast(hts_full, h=15, method="bu", fmethod="arima", keep.fitted = TRUE, keep.resid = TRUE)

Now, I'm not sure that this is really the right way to do it as I don't know if ARIMA models that were used to estimate my train data are the same ARIMA models that I'm now using to forecast the full data set (I presume fmethod="arima" utilizes auto.arima) . I'd like them to remain the same models, otherwise the models evaluated by my out of sample accuracy measures are different from the models I used for the final forecast.

I see there is a FUN argument that represents "a user-defined function that returns an object which can be passed to the forecast function". Perhaps that argument can be used in the last line of my code somehow to make sure the models I fit on the train data are used to forecast the full data set?

Any suggestions on what sort of R code would help would be much appreciated.

回答1:

The functions are not set up to do that. However, it is not too difficult to do what you want. Here is some sample code

library(hts)
data <- htseg2

# Split data into training and test sets
hts_train <- window(data, end=2004)
hts_test <- window(data, start=2005)

# Fit models and compute forecasts on all nodes using training data
train <- aggts(hts_train)
fmodels <- list()
fc <- matrix(0, ncol=ncol(train), nrow=3)
for(i in 1:ncol(train))
{
  fmodels[[i]] <- auto.arima(train[,i])
  fc[,i] <- forecast(fmodels[[i]],h=3)$mean
}
forecast <- combinef(fc, nodes=data$nodes)
accuracy <- accuracy.gts(forecast, hts_test)

# Forecast on full data set without re-estimating parameters
full <- aggts(data)
fcfull <- matrix(0, ncol=ncol(full), nrow=15)
for(i in 1:ncol(full))
{
  fcfull[,i] <- forecast(Arima(full[,i], model=fmodels[[i]]),
                         h=15)$mean
}
forecast.full <- combinef(fcfull, nodes=data$nodes)

# Forecast on full data set with same models but re-estimated parameters
full <- aggts(data)
fcfull <- matrix(0, ncol=ncol(full), nrow=15)
for(i in 1:ncol(full))
{
  fcfull[,i] <- forecast(Arima(full[,i], 
                       order=fmodels[[i]]$arma[c(1,6,2)],
                       seasonal=fmodels[[i]]$arma[c(3,7,4)]),
                       h=15)$mean
}
forecast.full <- combinef(fcfull, nodes=data$nodes)

来源：https://stackoverflow.com/questions/33062141/hierarchical-time-series

标签

forecasting