the rolling regression in R using roll apply

时间秒杀一切 提交于 2019-11-30 19:49:11

It is not really clear what your data actually is (use dput(example_data) to give reproducible examples).

But the lm call in your example is simply doing the same regression over and over again (your x is not changing) and as josilber points out, it is supposed to be a function. Here is an example where all the data is in the data.frame allRegData and it has at least two columns, one named y and another named x:

require(zoo)
rollapply(zoo(allRegData),
          width=262,
          FUN = function(Z) 
          { 
             t = lm(formula=y~x, data = as.data.frame(Z), na.rm=T); 
             return(t$coef) 
          },
          by.column=FALSE, align="right") 

I guess your questions are

  1. that you want to apply rolling regression on 262 width window of data for roughly 6 years yielding 1572 which is close to your 1596 observations with six covariates.
  2. figure out how to solve your problem with rollapply.
  3. an issue with loading in a data set from Excel.

It seems hard to help you with 3. since you do not provide the data set or the R code you use. Starting with 1., then you can use the rollRegres package I have made. What you would do is something like this

#####
# simulate data
width  <- 262L
n_yr   <- 6L
n_covs <- 6L
set.seed(23847996)
X <- matrix(rnorm(n_yr * width), ncol = n_covs, dimnames = list(NULL, paste0("X", 1:n_covs)))
df <- data.frame(Y = rnorm(n_yr * width), X)
head(df)
#R         Y     X1     X2     X3      X4     X5      X6
#R 1 -1.1478  0.516 -1.381  0.776 -0.0992  1.254 -0.8444
#R 2 -0.0542  1.328  1.411 -1.206  0.2560  0.975  0.9534
#R 3 -0.8350 -1.402  1.190  0.591 -1.5928  0.330  1.0806
#R 4 -0.5902  0.937 -0.182  0.193  0.1575  0.217 -0.2613
#R 5  1.7891 -0.608 -1.090  0.180 -1.1765 -0.992 -0.8831
#R 6  2.0108  0.259 -0.129  0.261  1.6694 -1.822  0.0616

#####
# estimate coefs
library(rollRegres)
fits <- roll_regres(Y ~ X1 + X2 + X3 + X4 + X5 + X6, df, width = width)
tail(fits$coefs) # estimated coefs
#R      (Intercept)     X1      X2      X3     X4      X5      X6
#R 1567    0.006511 0.0557 -0.0243 0.00907 0.0234 -0.0370 -0.0553
#R 1568    0.005816 0.0569 -0.0233 0.00798 0.0248 -0.0370 -0.0560
#R 1569    0.006406 0.0566 -0.0231 0.00814 0.0243 -0.0385 -0.0555
#R 1570    0.004898 0.0519 -0.0213 0.00773 0.0323 -0.0391 -0.0588
#R 1571    0.002922 0.0532 -0.0211 0.00809 0.0307 -0.0377 -0.0609
#R 1572    0.000771 0.0538 -0.0262 0.00580 0.0309 -0.0363 -0.0658

Now, regarding 2. then you can do something like what Hans Roggeman shows but a version that works with multiple regression as you request

library(zoo)
c2 <- rollapply(
  df, width = width, function(z){
    coef(lm(Y ~ X1 + X2 + X3 + X4 + X5 + X6, as.data.frame(z)))
  }, by.column = FALSE, fill = NA_real_, align = "right")
all.equal(fits$coefs, c2, check.attributes = FALSE) # gives the same
#R [1] TRUE

It is much slower though

microbenchmark::microbenchmark(
  rollRegrs = roll_regres(Y ~ X1 + X2 + X3 + X4 + X5 + X6, df, width = width),
  rollapply = rollapply(
    df, width = width, function(z){
      coef(lm(Y ~ X1 + X2 + X3 + X4 + X5 + X6, as.data.frame(z)))
    }, by.column = FALSE, fill = NA_real_, align = "right"), times = 25)
#R Unit: milliseconds
#R       expr     min      lq    mean  median      uq     max neval
#R  rollRegrs    1.73    1.98    2.31    2.37    2.64    3.01    25
#R  rollapply 1726.74 1798.37 1881.25 1834.00 1964.62 2178.58    25

The rollapply version can be faster if you use lm.fit but it it still slower than roll_regres.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!