How to extract the coefficients from a linear model without repeating my code in R?

梦想的初衷 posted on 2021-02-11 12:49:43

Question


I am using a Monte Carlo simulation to predict mpg in the mtcars data. I want to extract the coefficients of all the variables in the data frame and count how many times one car has a lower predicted mpg than another, for example how many times Toyota Corona has a lower predicted mpg than Datsun 710. Below is my initial code, which uses only two independent variables. I want to extend it to use every variable in the data frame without having to list each one manually. Is there a way to do this?

library(pacman)
pacman::p_load(data.table, fixest, stargazer, dplyr, magrittr)

df <- mtcars
fit <- lm(mpg~cyl + hp, data = df)
fit$coefficients[1]

beta_0 = fit$coefficients[1] # Intercept 
beta_1 = fit$coefficients[2] # Slope
beta_2 = fit$coefficients[3]
set.seed(1)  # Seed
n = 1000     # Sample size
M = 500      # Number of experiments/iterations


estimates_DT <- do.call("rbind",lapply(1:M, function(i) {
  # Generate data
  U_i = rnorm(n, mean = 0, sd = 2) # Error
  X_i_1 = rnorm(n, mean = 5, sd = 5) # First independent variable
  X_i_2 = rnorm(n, mean = 5, sd = 5) # Second independent variable
  Y_i = beta_0 + beta_1*X_i_1 + beta_2*X_i_2 + U_i  # Dependent variable
  
  # Formulate data.table
  data_i = data.table(Y = Y_i, X1 = X_i_1, X2 = X_i_2)
  
  # Run regressions
  ols_i <- fixest::feols(data = data_i, Y ~ X1 + X2)  
  ols_i$coefficients
}))

estimates_DT <- setNames(data.table(estimates_DT),c("beta_0","beta_1","beta_2"))

compareCarEstimations <- function(carname1="Mazda RX4",carname2="Datsun 710") {
  car1data <- mtcars[rownames(mtcars) == carname1,c("cyl","hp")]
  car2data <- mtcars[rownames(mtcars) == carname2,c("cyl","hp")]
  
  predsCar1 <- estimates_DT[["beta_0"]] + car1data$cyl*estimates_DT[["beta_1"]]+car1data$hp*estimates_DT[["beta_2"]]
  predsCar2 <- estimates_DT[["beta_0"]] + car2data$cyl*estimates_DT[["beta_1"]]+car2data$hp*estimates_DT[["beta_2"]]
  
  list(
    car1LowerCar2 = sum(predsCar1 < predsCar2),
    car2LowerCar1 = sum(predsCar1 >= predsCar2)
  )
}

compareCarEstimations("Toyota Corona", "Datsun 710")

Answer 1:


I haven't gone all the way through your example, but here is the nugget of how to construct a set of randomized predictor variables and matrix-multiply them by the coefficient vector to get predicted values:

Setup:

df <- mtcars
fit <- lm(mpg~cyl + hp, data = df)
n <- 1000
beta <- coef(fit) ## parameter vector (includes intercept)
npar <- length(beta)
npred <- npar - 1  ## number of predictors (excludes intercept)
X <- matrix(rnorm(n*npred),ncol=npred)  ## random predictors only
## scale columns by the corresponding sd
## (all identical in this case)
X <- sweep(X, MARGIN=2, FUN="*", STATS=rep(5,npred))
## shift columns by the corresponding mean
## (all identical in this case)
X <- sweep(X, MARGIN=2, FUN="+", STATS=rep(5,npred))
## prepend a column of ones so that beta[1] acts as a constant intercept
X <- cbind(1, X)
Y0 <- X %*% beta
Y <- rnorm(n, mean=Y0, sd=2)
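
Building on the matrix approach above, here is a minimal sketch of how the whole simulation from the question might be generalized to every predictor in mtcars. It is one possible arrangement, not the original poster's code: the names `fit_all`, `estimates`, and `compare_cars` are hypothetical, the regressions use base R's `lm.fit()` on the design matrix instead of `fixest::feols()`, and the intercept is handled with an explicit column of ones. The predictor and error distributions (mean 5, sd 5, and sd 2) are copied from the question.

```r
# Fit on all predictors: `mpg ~ .` means "mpg against every other column"
fit_all <- lm(mpg ~ ., data = mtcars)
beta  <- coef(fit_all)        # named vector: intercept plus one slope per predictor
npred <- length(beta) - 1     # number of predictors, excluding the intercept

set.seed(1)
n <- 1000   # sample size per experiment
M <- 500    # number of experiments

# One row of estimated coefficients per experiment (M x length(beta) matrix)
estimates <- t(sapply(seq_len(M), function(i) {
  # Random predictors as in the question, with a leading column of ones
  X <- cbind(1, matrix(rnorm(n * npred, mean = 5, sd = 5), ncol = npred))
  Y <- drop(X %*% beta) + rnorm(n, mean = 0, sd = 2)
  # lm.fit() takes the design matrix directly, so no formula has to be written
  lm.fit(X, Y)$coefficients
}))
colnames(estimates) <- names(beta)

# Hypothetical helper: compare two cars across all M coefficient draws
compare_cars <- function(car1, car2, est = estimates) {
  vars <- colnames(est)[-1]                   # predictor names, intercept dropped
  x1 <- c(1, unlist(mtcars[car1, vars]))      # leading 1 multiplies the intercept
  x2 <- c(1, unlist(mtcars[car2, vars]))
  preds1 <- drop(est %*% x1)                  # M predicted mpg values for car1
  preds2 <- drop(est %*% x2)
  list(
    car1LowerCar2 = sum(preds1 < preds2),
    car2LowerCar1 = sum(preds1 >= preds2)
  )
}

compare_cars("Toyota Corona", "Datsun 710")
```

Because the coefficients are kept as a named matrix and the car data are selected by those same names, adding or removing predictors only requires changing the formula passed to `lm()`.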


Source: https://stackoverflow.com/questions/66053962/how-to-extract-the-coefficients-from-a-linear-model-without-repeating-my-code-in
