How to succinctly write a formula with many variables from a data frame?

一整个雨季 2020-11-22 17:01

Suppose I have a response variable and a data frame containing three covariates (as a toy example):

y = c(1,4,6)
d = data.frame(x1 = c(4,-1,3), x2 = c(3,9,8), x3 =         


        
6 Answers
  •  清歌不尽
    2020-11-22 17:58

    There is a special identifier that can be used in a formula to mean all the variables: the . identifier. The response must be a column of the data frame for this to work, so y is added to d here.

    y <- c(1,4,6)
    d <- data.frame(y = y, x1 = c(4,-1,3), x2 = c(3,9,8), x3 = c(4,-4,-2))
    mod <- lm(y ~ ., data = d)
    

    You can also do things like this, to use all variables but one (in this case x3 is excluded):

    mod <- lm(y ~ . - x3, data = d)
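
For comparison, the same model can be written without the dot by building the formula from the column names. This is only a sketch of one alternative, not part of the original answer: it uses `reformulate()` from the stats package, and the names `predictors`, `mod_explicit` and `mod_dot` are made up for illustration.

```r
# Toy data from the answer above, with the response stored as a column
d <- data.frame(y = c(1, 4, 6), x1 = c(4, -1, 3), x2 = c(3, 9, 8), x3 = c(4, -4, -2))

# Keep every column except the response and the excluded covariate
predictors <- setdiff(names(d), c("y", "x3"))

# reformulate() turns the character vector into a formula with y on the left
f <- reformulate(predictors, response = "y")

mod_explicit <- lm(f, data = d)
mod_dot <- lm(y ~ . - x3, data = d)

# Compare the estimated coefficients of the two fits
print(all.equal(coef(mod_explicit), coef(mod_dot)))
```

This form may be handy when the set of excluded columns is itself stored in a variable, since `setdiff()` accepts any character vector.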
    

    Technically, . means all variables not already mentioned in the formula. For example:

    lm(y ~ x1 * x2 + ., data = d)
    

    where . would only reference x3, since x1 and x2 are already in the formula.
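
One way to check what the dot stands for in a given formula is to expand it against the data with `terms()` and read the `term.labels` attribute. The snippet below is a minimal sketch using the same toy data; the variable names `labels_all`, `labels_drop` and `labels_inter` are made up for illustration.

```r
# Toy data from the answer above, with the response stored as a column
d <- data.frame(y = c(1, 4, 6), x1 = c(4, -1, 3), x2 = c(3, 9, 8), x3 = c(4, -4, -2))

# terms() with a data argument expands "." into the columns of d;
# term.labels then lists the right-hand-side terms after expansion
labels_all   <- attr(terms(y ~ ., data = d), "term.labels")
labels_drop  <- attr(terms(y ~ . - x3, data = d), "term.labels")
labels_inter <- attr(terms(y ~ x1 * x2 + ., data = d), "term.labels")

print(labels_all)    # terms for the plain dot
print(labels_drop)   # terms once x3 has been removed
print(labels_inter)  # main effects plus the x1:x2 interaction
```

Each term should appear only once in the last result, even though x1 and x2 are named explicitly and are also columns of d.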
