Forecasting time series data

Submitted by 拥有回忆 on 2019-11-26 21:48:26

Here's what I did:

library(xts)
library(forecast)
x$Date = as.Date(x$Date, format="%m/%d/%Y")
x = xts(x=x$Used, order.by=x$Date)
# To get the start date (305):
#     > as.POSIXlt("2011-11-01")$yday
##    [1] 304
# Add one, since yday starts at 0
x.ts = ts(x, frequency=365, start=c(2011, 305))
plot(forecast(ets(x.ts), 10))

Resulting in:

What can we learn from this:

  • Many of your steps can be combined, reducing the number of intermediate objects you create.
  • The output is still not as pretty as @joran's, but it is easily readable. 2011.85 means "day number 365*.85" (roughly day 310 of the year).
  • The day of the year can be found with as.POSIXlt("2011-11-01")$yday + 1 (yday counts from 0), and the date for a given day number with something like as.Date(310 - 1, origin="2011-01-01") (the origin itself is day 1).
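The round trip in the last bullet can be sketched in base R as follows. The variable names are illustrative only and do not come from the original code; the final lines check how `ts` turns `start=c(2011, 305)` into a decimal year when `frequency=365`.

```r
# Date -> day of year (1-based)
d   <- as.Date("2011-11-01")
doy <- as.POSIXlt(d)$yday + 1          # yday is 0-based, so add 1

# Day of year -> date (the origin is day 1, so subtract 1)
back <- as.Date(doy - 1, origin = "2011-01-01")

# Position of that day on the decimal-year axis used by ts()
pos <- 2011 + (doy - 1) / 365

# A dummy series, only to inspect the time index that ts() builds
dummy <- ts(1:3, start = c(2011, doy), frequency = 365)
first_time <- as.numeric(time(dummy))[1]
```

Here `doy` is 305, `back` is 2011-11-01 again, and `first_time` matches `pos`, which is why the forecast table is labelled in fractions of a year rather than dates.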

Update

You can drop even more intermediate steps, since there's no reason to first convert your data into an xts.

x = ts(x$Used, start=c(2011, as.POSIXlt("2011-11-01")$yday+1), frequency=365)
# NOTE: We have only selected the "Used" variable 
# since ts will take care of dates
plot(forecast(ets(x), 10))

This gives exactly the same result as the image above.

Update 2

Building on the solution provided by @joran, you can try:

# 'start' calculation = `as.Date("2011-11-01")-as.Date("2011-01-01")+1`
# No need to convert anything to dates at this point using xts
x = ts(x$Used, start=c(2011, 305), frequency=365)
# Directly plot your forecast without your axes
plot(forecast(ets(x), 10), axes = FALSE)
# Generate labels for your x-axis
a = seq(as.Date("2011-11-01"), by="weeks", length=11)
# Plot your axes.
# `at` is an approximation--there's probably a better way to do this, 
# but the logic is approximately 365.25 days in a year, and an origin
# date in R of `January 1, 1970`
axis(1, at = as.numeric(a)/365.25+1970, labels = a, cex.axis=0.6)
axis(2, cex.axis=0.6)

Which will yield:

Part of the problem in your original code is that after you have converted your data to an xts object, and converted that to a ts object, you lose the dates in your forecast points.

Compare the first column (Point) of your x.fore output to the following:

> forecast(ets(x), 10)
         Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
2012.000       741.6437 681.7991 801.4884 650.1192 833.1682
2012.003       741.6437 676.1250 807.1624 641.4415 841.8459
2012.005       741.6437 670.9047 812.3828 633.4577 849.8298
2012.008       741.6437 666.0439 817.2435 626.0238 857.2637
2012.011       741.6437 661.4774 821.8101 619.0398 864.2476
2012.014       741.6437 657.1573 826.1302 612.4328 870.8547
2012.016       741.6437 653.0476 830.2399 606.1476 877.1399
2012.019       741.6437 649.1202 834.1672 600.1413 883.1462
2012.022       741.6437 645.3530 837.9345 594.3797 888.9078
2012.025       741.6437 641.7276 841.5599 588.8352 894.4523

Hopefully this helps you understand the problem with your original approach and improves your ability to work with time series in R.

Update 3

Final, and more accurate solution--because I'm avoiding other work that I should actually be doing right now...

Use the lubridate package for better date handling:

require(lubridate)
y = ts(x$Used, start=c(2011, yday("2011-11-01")), frequency=365)
plot(forecast(ets(y), 10), xaxt="n")
a = seq(as.Date("2011-11-01"), by="weeks", length=11)
axis(1, at = decimal_date(a), labels = format(a, "%Y %b %d"), cex.axis=0.6)
abline(v = decimal_date(a), col='grey', lwd=0.5)

Resulting in:

Note the alternative method of identifying the start date for your ts object.
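To see how far the 365.25-day approximation from Update 2 drifts from an exact year fraction, here is a rough base-R sketch. `year_fraction` is a hypothetical helper written for this comparison (it is not a lubridate function); it divides the days elapsed in the year by that year's actual length, which I am assuming is close to what `decimal_date` does for dates.

```r
# Hypothetical helper: exact position of a date within its own calendar year
year_fraction <- function(d) {
  d     <- as.Date(d)
  yr    <- as.integer(format(d, "%Y"))
  start <- as.Date(paste0(yr, "-01-01"))
  end   <- as.Date(paste0(yr + 1, "-01-01"))
  yr + as.numeric(d - start) / as.numeric(end - start)
}

# The same weekly tick dates used for the axis labels
a <- seq(as.Date("2011-11-01"), by = "weeks", length.out = 11)

approx_pos <- as.numeric(a) / 365.25 + 1970   # Update 2 approximation
exact_pos  <- year_fraction(a)                # calendar-aware version

# Largest disagreement between the two, expressed in days
max_gap_days <- max(abs(approx_pos - exact_pos)) * 365
```

For these dates the gap stays under one day, so the approximate ticks are only slightly off, but the calendar-aware positions line up with the `ts` time index at the start date.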

If you have no preference for a specific model, I suggest using one that works across a wide range of situations:

library(forecast)
t.ser <- ts(used, start=c(2011,1), freq=12)
t.ets <- ets(t.ser)
t.fc <- forecast(t.ets,h=10)

This will give you the prediction for the next 10 months.

More technically, it uses the exponential smoothing method, which is a good choice for general situations. Depending on the kind of data, there might be a better model for your specific use, but ets is a good general choice.

It's important to highlight that since you don't have two complete periods (less than 24 months), the model cannot detect seasonality, so no seasonal component will be included in the calculations.
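A quick way to check the two-period condition before fitting, sketched in base R. The `used` values below are placeholder numbers generated for illustration, not the asker's data, and `has_two_periods` is an illustrative name.

```r
# Placeholder data: 14 monthly values (NOT the original data)
used  <- 700 + 5 * (1:14)
t.ser <- ts(used, start = c(2011, 1), frequency = 12)

# A seasonal pattern can only be estimated with at least two full periods
has_two_periods <- length(t.ser) >= 2 * frequency(t.ser)

if (!has_two_periods) {
  message("Fewer than ", 2 * frequency(t.ser),
          " observations: expect a non-seasonal model.")
}
```

With 14 monthly observations the check is FALSE, which corresponds to the situation described above.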

Altering the plot to show the dates is fairly easy, by simply suppressing the axes in the original plot and then drawing them yourself:

plot(x.fore,axes = FALSE)
axis(2)
axis(1,at = pretty(1:72,n = 6),
       labels = (x$Date[1]-1) + pretty(1:72,n = 6),
       cex.axis = 0.65)
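The label arithmetic in that `axis` call can be tried on its own. This sketch assumes the forecast is plotted against an observation index running from 1 to 72 and that the first date is 2011-11-01; `first_date` stands in for `x$Date[1]`.

```r
# Stand-in for x$Date[1] (assumed value)
first_date <- as.Date("2011-11-01")

# Tick positions on the index axis, chosen by pretty()
ticks <- pretty(1:72, n = 6)

# Index 1 is the first date, so index k maps to first_date + (k - 1)
tick_labels <- (first_date - 1) + ticks

data.frame(index = ticks, date = tick_labels)
```

Subtracting one day first is what makes index 1 land on the first observation's date rather than the day after it.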
