How to succinctly write a formula with many variables from a data frame?

一整个雨季 2020-11-22 17:01

Suppose I have a response variable and a data frame containing three covariates (as a toy example):

y = c(1,4,6)
d = data.frame(x1 = c(4,-1,3), x2 = c(3,9,8), x3 =         


        
6 Answers
  •  迷失自我
    2020-11-22 17:55

    I built this solution because reformulate() does not handle variable names that contain white space.

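    # Wrap each name in backticks so names with spaces parse as single symbols
    add_backticks = function(x) {
        paste0("`", x, "`")
    }
    
    # Join the quoted predictor names into the right-hand side: `a` + `b` + ...
    x_lm_formula = function(x) {
        paste(add_backticks(x), collapse = " + ")
    }
    
    # Build a formula from character vectors of column names
    build_lm_formula = function(x, y){
        # Exactly one response is allowed (this also rejects an empty y)
        if (length(y) != 1){
            stop("y needs to be just one variable")
        }
        as.formula(
            paste0(add_backticks(y), " ~ ", x_lm_formula(x))
        )
    }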
    
    # Example
    df <- data.frame(
        y = c(1,4,6), 
        x1 = c(4,-1,3), 
        x2 = c(3,9,8), 
        x3 = c(4,-4,-2)
        )
    
    # Model Specification
    columns = colnames(df)
    y_cols = columns[1]
    x_cols = columns[-1]
    formula = build_lm_formula(x_cols, y_cols)
    formula
    # output (a formula object; backticks are not printed for syntactic names)
    # y ~ x1 + x2 + x3
    
    # Run Model
    lm(formula = formula, data = df)
    # output
    # Call:
    # lm(formula = formula, data = df)
    #
    # Coefficients:
    # (Intercept)           x1           x2           x3  
    #     -5.6316       0.7895       1.1579           NA  
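
The toy data above only has syntactic column names, so it never actually exercises the backtick quoting. Here is a minimal sketch with spaces in the column names. The data frame is made up for illustration, and quote_name and make_formula are hypothetical names that just condense the helpers above; this is one possible way to check the idea, not the only one.

```r
# Sketch under assumptions: made-up data, hypothetical helper names.

# Wrap names in backticks so "first x" parses as one symbol
quote_name <- function(x) paste0("`", x, "`")

# Build `response` ~ `p1` + `p2` + ... from character vectors
make_formula <- function(response, predictors) {
  stopifnot(length(response) == 1)
  rhs <- paste(quote_name(predictors), collapse = " + ")
  as.formula(paste(quote_name(response), "~", rhs))
}

# check.names = FALSE keeps the spaces in the column names
df2 <- data.frame(
  `my response` = c(1, 4, 6, 3, 8),
  `first x`     = c(4, -1, 3, 0, 2),
  `second x`    = c(3, 9, 8, 1, 5),
  check.names = FALSE
)

f <- make_formula("my response", c("first x", "second x"))

# The variables are looked up in df2 by their unquoted names
all.vars(f)

fit <- lm(f, data = df2)
coef(fit)
```

Note that check.names = FALSE matters here: without it, data.frame() would rewrite the names (for example to my.response), and the quoted formula would no longer match any column.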
    
