How to succinctly write a formula with many variables from a data frame?

一整个雨季 2020-11-22 17:01

Suppose I have a response variable and a data frame containing three covariates (as a toy example):

y = c(1,4,6)
d = data.frame(x1 = c(4,-1,3), x2 = c(3,9,8), x3 =         


        
6 answers
  •  再見小時候
    2020-11-22 17:42

    An extension of juba's method is to use reformulate, a function which is explicitly designed for such a task.

    ## Create a formula for a model with a large number of variables:
    xnam <- paste("x", 1:25, sep="")
    
    reformulate(xnam, "y")
    y ~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9 + x10 + x11 + 
        x12 + x13 + x14 + x15 + x16 + x17 + x18 + x19 + x20 + x21 + 
        x22 + x23 + x24 + x25
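
For comparison, here is a minimal sketch of what reformulate saves you from writing by hand. It assumes the paste-based alternative is a plain as.formula(paste(...)) call; whether that matches juba's answer exactly is an assumption, and the names f1 and f2 are only illustrative.

```r
# Character vector of predictor names, as in the answer above
xnam <- paste0("x", 1:25)

# reformulate() builds the formula from term labels and a response name
f1 <- reformulate(xnam, response = "y")

# Hand-rolled equivalent: paste the terms together, then parse the string
f2 <- as.formula(paste("y ~", paste(xnam, collapse = " + ")))

# Compare the variables used by the two formulas
same_vars <- identical(all.vars(f1), all.vars(f2))
```

The reformulate version avoids assembling and parsing a string yourself, which is the main reason to prefer it.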
    

    For the example in the OP, the easiest solution here would be

    # add y variable to data.frame d
    d <- cbind(y, d)
    reformulate(names(d)[-1], names(d[1]))
    y ~ x1 + x2 + x3
    

    or

    mod <- lm(reformulate(names(d)[-1], names(d[1])), data=d)
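
Once y is a column of d, base R's dot shorthand in a formula (y ~ .) should give the same fit when every other column is a predictor. The sketch below checks this on made-up data (hypothetical values, not the OP's, with more rows than covariates so the model is estimable); m1 and m2 are illustrative names.

```r
# Hypothetical data with more rows than the OP's toy example
set.seed(42)
y <- rnorm(10)
d <- data.frame(x1 = rnorm(10), x2 = rnorm(10), x3 = rnorm(10))

# Put the response in the same data frame as the covariates
d <- cbind(y, d)

# Fit once with reformulate() and once with the dot shorthand
m1 <- lm(reformulate(names(d)[-1], names(d)[1]), data = d)
m2 <- lm(y ~ ., data = d)

# Compare the estimated coefficients of the two fits
same_fit <- all.equal(coef(m1), coef(m2))
```

reformulate remains the more flexible choice when only a subset of the columns should enter the model.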
    

    Note that adding the dependent variable to the data frame with d <- cbind(y, d) is preferred for two reasons: it lets reformulate build the formula from the column names, and it lets the fitted lm object be used later in functions like predict, which look up the variables by name in a data frame.
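
To illustrate the point about predict, here is a small sketch under stated assumptions: the data are simulated (hypothetical values, not the OP's), and newd is an illustrative data frame of new covariate values.

```r
# Hypothetical toy data: response and covariates live in one data frame
set.seed(1)
d <- data.frame(x1 = rnorm(20), x2 = rnorm(20), x3 = rnorm(20))
d$y <- 1 + 2 * d$x1 - d$x2 + rnorm(20, sd = 0.1)

# Every column except y becomes a predictor
f <- reformulate(setdiff(names(d), "y"), response = "y")
mod <- lm(f, data = d)

# Because the model was fitted with data = d, predict() can look up
# x1, x2 and x3 by name in a new data frame
newd <- data.frame(x1 = c(0, 1), x2 = c(0, 1), x3 = c(0, 1))
p <- predict(mod, newdata = newd)
```

A model fitted on terms such as d$x1 would generally not pick up the columns of newd this way, which is why keeping everything in one data frame is the safer habit.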
