How to extract the coefficients from a linear model without repeating my code in R?

Question

I am running a Monte Carlo simulation to predict mpg in the mtcars data. I want to extract the coefficients of all the variables in the data frame so that I can count how many times one car has a lower predicted mpg than another, for example how many times Toyota Corona has a lower predicted mpg than Datsun 710. My initial code, below, uses only two independent variables. I want to extend it to use every variable in the data frame without having to list each one manually. Is there any way to do this?

library(pacman)
pacman::p_load(data.table, fixest, stargazer, dplyr, magrittr)

df <- mtcars
fit <- lm(mpg~cyl + hp, data = df)
fit$coefficients[1]

beta_0 = fit$coefficients[1] # Intercept
beta_1 = fit$coefficients[2] # Coefficient on cyl
beta_2 = fit$coefficients[3] # Coefficient on hp
set.seed(1)  # Seed
n = 1000     # Sample size
M = 500      # Number of experiments/iterations


estimates_DT <- do.call("rbind",lapply(1:M, function(i) {
  # Generate data
  U_i = rnorm(n, mean = 0, sd = 2) # Error
  X_i_1 = rnorm(n, mean = 5, sd = 5) # First independent variable
  X_i_2 = rnorm(n, mean = 5, sd = 5) # Second independent variable
  Y_i = beta_0 + beta_1*X_i_1 + beta_2*X_i_2 + U_i  # Dependent variable
  
  # Formulate data.table
  data_i = data.table(Y = Y_i, X1 = X_i_1, X2 = X_i_2)
  
  # Run regressions
  ols_i <- fixest::feols(data = data_i, Y ~ X1 + X2)  
  ols_i$coefficients
}))

estimates_DT <- setNames(data.table(estimates_DT),c("beta_0","beta_1","beta_2"))

compareCarEstimations <- function(carname1="Mazda RX4",carname2="Datsun 710") {
  car1data <- mtcars[rownames(mtcars) == carname1,c("cyl","hp")]
  car2data <- mtcars[rownames(mtcars) == carname2,c("cyl","hp")]
  
  predsCar1 <- estimates_DT[["beta_0"]] + car1data$cyl*estimates_DT[["beta_1"]]+car1data$hp*estimates_DT[["beta_2"]]
  predsCar2 <- estimates_DT[["beta_0"]] + car2data$cyl*estimates_DT[["beta_1"]]+car2data$hp*estimates_DT[["beta_2"]]
  
  list(
    car1LowerCar2 = sum(predsCar1 < predsCar2),
    car2LowerCar1 = sum(predsCar1 >= predsCar2)
  )
}

compareCarEstimations("Toyota Corona", "Datsun 710")

Answer 1:

I haven't gone all the way through your example, but here is the nugget of how to construct a set of randomized predictor variables and matrix-multiply them by the coefficient vector to get predicted values:

Setup:

df <- mtcars
fit <- lm(mpg~cyl + hp, data = df)
n <- 1000

beta <- coef(fit) ## parameter vector (includes intercept)
npred <- length(beta) - 1 ## number of predictors (intercept excluded)
X <- matrix(rnorm(n*npred),ncol=npred)  ## standard-normal predictors
## scale columns by the corresponding sd
## (all identical in this case)
X <- sweep(X, MARGIN=2, FUN="*", STATS=rep(5,npred))
## shift columns by the corresponding mean
## (all identical in this case)
X <- sweep(X, MARGIN=2, FUN="+", STATS=rep(5,npred))
X <- cbind(1, X) ## column of ones, so beta[1] acts as a constant intercept
Y0 <- X %*% beta
Y <- rnorm(n, mean=Y0, sd=2)
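
The same matrix-multiplication idea can be carried through to the car comparison in the question. What follows is only a rough sketch under a few assumptions that are not in the original post: every predictor is still drawn from a normal distribution with mean 5 and sd 5, base R's lm.fit stands in for fixest::feols, and the names estimates, npred and Xcars are introduced here purely for illustration. The formula mpg ~ . picks up every other column of the data frame, so no variable has to be typed by hand.

```r
set.seed(1)
df   <- mtcars
fit  <- lm(mpg ~ ., data = df)   # "." means: use every other column as a predictor
beta <- coef(fit)                # named vector: (Intercept), cyl, disp, ...
npred <- length(beta) - 1        # number of predictors, intercept excluded
n <- 1000                        # sample size per experiment
M <- 500                         # number of experiments

# One column of estimates per experiment; transpose to get one row per experiment
estimates <- t(vapply(seq_len(M), function(i) {
  # Assumed design: a column of ones plus N(5, 5^2) predictors, as in the question
  X <- cbind(1, matrix(rnorm(n * npred, mean = 5, sd = 5), ncol = npred))
  Y <- drop(X %*% beta) + rnorm(n, mean = 0, sd = 2)
  lm.fit(X, Y)$coefficients
}, numeric(length(beta))))
colnames(estimates) <- names(beta)

# Design-matrix rows for the real cars; columns are in the same order as beta
Xcars <- model.matrix(fit)

compareCarEstimations <- function(carname1 = "Mazda RX4", carname2 = "Datsun 710") {
  # One predicted mpg per experiment for each car
  predsCar1 <- drop(estimates %*% Xcars[carname1, ])
  predsCar2 <- drop(estimates %*% Xcars[carname2, ])
  list(
    car1LowerCar2 = sum(predsCar1 < predsCar2),
    car2LowerCar1 = sum(predsCar1 >= predsCar2)
  )
}

result <- compareCarEstimations("Toyota Corona", "Datsun 710")
```

Because the predictions are a single matrix product, changing the model formula (for instance back to mpg ~ cyl + hp) should not require touching the comparison function.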

Source: https://stackoverflow.com/questions/66053962/how-to-extract-the-coefficients-from-a-linear-model-without-repeating-my-code-in

Tags

regression

montecarlo