How to calculate variance of least squares estimator using QR decomposition in R?

前端 未结 1 1941
既然无缘
既然无缘 2020-12-15 13:53

I\'m trying to learn QR decomposition, but can\'t figure out how to get the variance of beta_hat without resorting to traditional matrix calculations. I\'m practising with

相关标签:
1条回答
  • 2020-12-15 14:34

    setup (copying in your code)

    y <- iris$Sepal.Length
    x <- iris$Sepal.Width
    X <- cbind(1,x)
    n <- nrow(X)
    p <- ncol(X)
    qr.X <- qr(X)
    b <- (t(qr.Q(qr.X)) %*% y)[1:p]  ## can be optimized; see Remark 1 below
    R <- qr.R(qr.X)  ## can be optimized; see Remark 2 below
    beta <- as.vector(backsolve(R, b))
    res <- as.vector(y - X %*% beta)
    

    math

    computation

    Residual degree of freedom is n - p, so estimated variance is

    se2 <- sum(res ^ 2) / (n - p)
    

    Thus, the variance covariance matrix of estimated coefficients is

    V <- chol2inv(R) * se2
    
    #           [,1]         [,2]
    #[1,]  0.22934170 -0.07352916
    #[2,] -0.07352916  0.02405009
    

    validation

    Let's check the correctness by comparing with lm:

    fit <- lm(Sepal.Length ~ Sepal.Width, iris)
    
    vcov(fit)
    
    #            (Intercept) Sepal.Width
    #(Intercept)  0.22934170 -0.07352916
    #Sepal.Width -0.07352916  0.02405009
    

    Identical result!


    Remark 1 (skip forming 'Q' factor)

    Instead of b <- (t(qr.Q(qr.X)) %*% y)[1:p], you can use function qr.qty (to avoid forming 'Q' matrix):

    b <- qr.qty(qr.X, y)[1:p]
    

    Remark 2 (skip forming 'R' factor)

    You don't have to extract R <- qr.R(qr.X) for backsolve; using qr.X$qr is sufficient:

    beta <- as.vector(backsolve(qr.X$qr, b))
    

    Appendix: A function for estimation

    The above is the simplest demonstration. In practice column pivoting and rank-deficiency need be dealt with. The following is an implementation. X is a model matrix and y is the response. Results should be compared with lm(y ~ X + 0).

    qr_estimation <- function (X, y) {
      ## QR factorization
      QR <- qr(X)
      r <- QR$rank
      piv <- QR$pivot[1:r]
      ## estimate identifiable coefficients
      b <- qr.qty(QR, y)[1:r]
      beta <- backsolve(QR$qr, b, r)
      ## fitted values
      yhat <- base::c(X[, piv] %*% beta)
      ## residuals
      resi <- y - yhat
      ## error variance
      se2 <- base::c(crossprod(resi)) / (nrow(X) - r)
      ## variance-covariance for coefficients
      V <- chol2inv(QR$qr, r) * se2
      ## post-processing on pivoting and rank-deficiency
      p <- ncol(X)
      beta_full <- rep.int(NA_real_, p)
      beta_full[piv] <- beta
      V_full <- matrix(NA_real_, p, p)
      V_full[piv, piv] <- V
      ## return
      list(coefficients = beta_full, vcov = V_full,
           fitted.values = yhat, residuals = resi, sig = sqrt(se2))
      }
    
    0 讨论(0)
提交回复
热议问题