R Loop for Variable Names to run linear regression model

不思量自难忘° 2020-12-06 15:38

First off, I am pretty new to this, so my method or thinking may be wrong. I have imported an xlsx data set into a data frame using R and RStudio. I want to be able to loop thro

2 answers
  • 2020-12-06 16:09

    Ok, I'll post an answer. I will use the dataset mtcars as an example; I believe the same approach will work with your dataset.
    First, I create a store, lm.test, an object of class list. In your code you assign the output of lm(.) to the same variable every time through the loop, so in the end you would only have the last fit; all the others would have been overwritten by the newer ones.
    Then, inside the loop, I use the function reformulate to put together the regression formula. There are other ways of doing this, but this one is simple.

    # Use just some columns
    data <- mtcars[, c("mpg", "cyl", "disp", "hp", "drat", "wt")]
    col10 <- names(data)[-1]
    
    lm.test <- vector("list", length(col10))
    
    for(i in seq_along(col10)){
        lm.test[[i]] <- lm(reformulate(col10[i], "mpg"), data = data)
    }
    
    lm.test
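
    To make the reformulate step above concrete, here is a minimal sketch of what it returns on its own. The names f1, f2 and fit are only illustrative and do not come from the question.

    ```r
    # reformulate(termlabels, response) turns character strings into a formula
    f1 <- reformulate("wt", response = "mpg")
    print(f1)   # mpg ~ wt

    # Several term labels are joined with "+"
    f2 <- reformulate(c("wt", "hp"), response = "mpg")
    print(f2)   # mpg ~ wt + hp

    # The result is an ordinary formula object, so lm() accepts it directly
    fit <- lm(f1, data = mtcars)
    print(coef(fit))
    ```

    Because the predictor name is passed as a string, the same call works unchanged inside a loop over column names.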
    

    Now you can use the results list for all sorts of things. I suggest you start using lapply and friends for that.
    For instance, to extract the coefficients:

    cfs <- lapply(lm.test, coef)
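
    As a possible next step, the list of coefficients can be collected into one matrix. This is only a sketch: it rebuilds lm.test with lapply instead of the for loop so that it runs on its own, and cf_mat and the column labels "intercept" and "slope" are names I made up for the example.

    ```r
    data <- mtcars[, c("mpg", "cyl", "disp", "hp", "drat", "wt")]
    col10 <- names(data)[-1]

    # One simple regression per predictor, stored in a named list
    lm.test <- lapply(col10, function(v) lm(reformulate(v, "mpg"), data = data))
    names(lm.test) <- col10

    # Each model has exactly two coefficients (intercept and slope), so
    # vapply can return a numeric matrix: one column per model, then transposed
    cf_mat <- t(vapply(lm.test, function(m) unname(coef(m)), numeric(2)))
    colnames(cf_mat) <- c("intercept", "slope")
    print(cf_mat)
    ```

    Naming the list by predictor means a single fit can also be pulled out later with lm.test[["wt"]].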
    

    In order to get the summaries:

    smry <- lapply(lm.test, summary)
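
    The summaries can be reduced to a small comparison table in the same way. The sketch below is self-contained, so it refits the models first; r2, pval and res are illustrative names, and picking R-squared and the slope p-value is my own choice of what to compare.

    ```r
    data <- mtcars[, c("mpg", "cyl", "disp", "hp", "drat", "wt")]
    col10 <- names(data)[-1]
    lm.test <- lapply(col10, function(v) lm(reformulate(v, "mpg"), data = data))
    names(lm.test) <- col10
    smry <- lapply(lm.test, summary)

    # R-squared of each model
    r2 <- vapply(smry, function(s) s$r.squared, numeric(1))
    # p-value of the slope: row 2 of the coefficient table
    pval <- vapply(smry, function(s) s$coefficients[2, "Pr(>|t|)"], numeric(1))

    res <- data.frame(predictor = col10, r.squared = r2, p.value = pval,
                      row.names = NULL)
    # Sort so the best-fitting single predictor comes first
    print(res[order(-res$r.squared), ])
    ```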
    

    It becomes very simple once you're familiar with *apply functions.

  • 2020-12-06 16:15

    You can create a temporary subset in which you select only the columns used in your regression. This way, you won't need to inject the temporary name in the formula.

    Sticking to your code, this should do the trick.

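    # Store every fit; assigning to a single variable would keep only the last one
    lm.test <- vector("list", length(col10))
    for(i in seq_along(col10)){
     # Keep only the response and the current predictor
     tempSubset <- data[, c("Total_Transactions", col10[i])]
     # "." stands for every other column in tempSubset, i.e. col10[i]
     lm.test[[i]] <- lm(Total_Transactions ~ ., data = tempSubset)
    }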
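
    Since the original data is not shown, here is a sketch of the same subset idea on mtcars, with mpg standing in for Total_Transactions (an assumption made only for this example). It also checks that the "." formula gives the same fit as naming the predictor explicitly.

    ```r
    data <- mtcars[, c("mpg", "cyl", "disp", "hp", "drat", "wt")]
    col10 <- names(data)[-1]

    lm.test <- vector("list", length(col10))
    names(lm.test) <- col10
    for (v in col10) {
      # Two-column subset: the response plus one predictor
      tempSubset <- data[, c("mpg", v)]
      lm.test[[v]] <- lm(mpg ~ ., data = tempSubset)
    }

    # formula() shows the "." expanded to the actual predictor
    print(formula(lm.test[["wt"]]))

    # Same coefficients as writing the predictor out by hand
    print(all.equal(coef(lm.test[["wt"]]), coef(lm(mpg ~ wt, data = mtcars))))
    ```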
    