Loop multiple 'multiple linear regressions' in R

Posted by 巧了我就是萌 on 2019-12-23 02:32:17

Question


I have a database where I want to do several multiple regressions. They all look like this:

fit <- lm(Variable1 ~ Age + Speed + Gender + Mass, data=Data)

The only thing that changes is the response, Variable1. I want to use a loop, or something from the apply family, to fit the same model with several different variables in place of Variable1. These variables are columns in my data frame. Can someone help me solve this problem? Many thanks!

What I tried so far:

When I extract one of the column names with the names() function, I do get the name of the column:

Varname = as.name(names(Data[14]))

But when I use this in the formula (after calling attach() on the data):

fit <- lm(Varname ~ Age + Speed + Gender + Mass, data=Data) 

I get the following error:

Error in model.frame.default(formula = Varname ~ Age + Speed + Gender + : object is not a matrix

I suppose that lm() does not resolve Varname to the column Variable1.


Answer 1:


You can use lapply to loop over your variables.

fit <- lapply(Data[,c(...)], function(x) lm(x ~ Age + Speed + Gender + Mass, data = Data))

This gives you a list of your results.

Replace c(...) with a character vector of your variable names. Alternatively, you can select the variables by their position in Data, for example Data[,1:5].
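As a minimal sketch of this approach, the example below builds a small made-up data frame and fits one model per response column. The column names Variable1 and Variable2, the simulated values, and the object names responses and fits are assumptions for illustration only. It indexes with Data[responses] rather than Data[, responses], so the selection stays a data frame even when only one column is chosen.

```r
# Hypothetical example data; all values are simulated for illustration
set.seed(1)
Data <- data.frame(
  Variable1 = rnorm(20),
  Variable2 = rnorm(20),
  Age       = runif(20, 20, 60),
  Speed     = rnorm(20, 60, 10),
  Gender    = rep(c("W", "M"), 10),
  Mass      = rnorm(20, 70, 10)
)

# Response columns to loop over (hypothetical names)
responses <- c("Variable1", "Variable2")

# lapply() passes each selected column to the function as the vector x;
# the predictors are looked up in Data
fits <- lapply(Data[responses], function(x) {
  lm(x ~ Age + Speed + Gender + Mass, data = Data)
})

# The list keeps the column names, so each model can be accessed by name
names(fits)
summary(fits$Variable1)
```

Because each column is passed in as x, the Call stored in every fitted model shows x rather than the original column name.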




Answer 2:


The problem in your case is that the formula passed to lm() is not evaluated like ordinary code: each name in it is looked up literally, either as a column of the data or as an object holding a whole vector. A variable that merely stores a column name is therefore not substituted automatically. To use the column names stored in varnames, build the formula from the string, pasting each name together with the other terms.

# generate some data
set.seed(123)
Data <- data.frame(x = rnorm(30), y = rnorm(30), 
    Age = sample(0:90, 30), Speed = rnorm(30, 60, 10), 
    Gender = sample(c("W", "M"), 30, replace = TRUE), Mass = rnorm(30))
varnames <- names(Data)[1:2]

# fit regressions for multiple dependent variables 
fit <- lapply(varnames, 
    FUN=function(x) lm(formula(paste(x, "~Age+Speed+Gender+Mass")), data=Data))
names(fit) <- varnames

fit
$x

Call:
lm(formula = formula(paste(x, "~Age+Speed+Gender+Mass")), data = Data)

Coefficients:
(Intercept)          Age        Speed      GenderW         Mass  
   0.135423     0.010013    -0.010413     0.023480     0.006939  


$y

Call:
lm(formula = formula(paste(x, "~Age+Speed+Gender+Mass")), data = Data)

Coefficients:
(Intercept)          Age        Speed      GenderW         Mass  
   2.232269    -0.008035    -0.027147    -0.044456    -0.023895  
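A possible variation on the code above, offered as a sketch rather than as part of the original answer: reformulate() from the stats package builds the formula from character vectors without paste(), and sapply() can gather the coefficients of all models into one matrix. The names predictors and coefs are assumptions introduced here, and Gender is generated with rep() instead of sample() only to keep the example deterministic.

```r
# Simulated data in the same shape as above
set.seed(123)
Data <- data.frame(
  x      = rnorm(30),
  y      = rnorm(30),
  Age    = sample(0:90, 30),
  Speed  = rnorm(30, 60, 10),
  Gender = rep(c("W", "M"), 15),
  Mass   = rnorm(30)
)

varnames   <- c("x", "y")
predictors <- c("Age", "Speed", "Gender", "Mass")

# reformulate() builds e.g. x ~ Age + Speed + Gender + Mass from strings;
# setNames() makes lapply() return a list named after the responses
fit <- lapply(setNames(varnames, varnames), function(v) {
  lm(reformulate(predictors, response = v), data = Data)
})

# One column of coefficients per response variable
coefs <- sapply(fit, coef)
coefs
```

Collecting the coefficients in a matrix makes it easier to compare the models side by side than reading the printed list.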


Source: https://stackoverflow.com/questions/41241806/loop-multiple-multiple-linear-regressions-in-r
