Add Column of Predicted Values to Data Frame with dplyr

太阳男子 2020-12-14 23:47

I have a data frame with a column of models, and I am trying to add a column of predicted values to it. A minimal example is:

exampleTable <- data.frame(x          


        
4 Answers
  •  星月不相逢
    2020-12-15 00:19

    The modelr package offers an elegant solution that fits into a tidyverse pipeline.

    The inputs

    library(dplyr)
    library(purrr)
    library(tidyr)
    
    # generate the inputs like in the question
    example_table <- data.frame(x = c(1:5, 1:5),
                                y = c((1:5) + rnorm(5), 2*(5:1)),
                                groups = rep(LETTERS[1:2], each = 5))
    
    models <- example_table %>% 
      group_by(groups) %>% 
      do(model = lm(y ~ x, data = .)) %>%
      ungroup()
    # as_tibble() replaces the deprecated tbl_df()
    example_table <- left_join(as_tibble(example_table), models, by = "groups")
    

    The solution

    # generate the extra column
    example_table %>%
      group_by(groups) %>%
      do(modelr::add_predictions(., first(.$model)))
    

    The explanation

    add_predictions adds a new column of predictions to a data frame using a given model. Unfortunately, it takes only one model as an argument. This is where do comes in: with do, we can run add_predictions separately on each group.

    Here . represents the data frame for the current group, .$model is its model column, and first() takes the first model of that group (after the join, every row in a group holds the same model).
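
    The per-group step described above can also be sketched without modelr. This is only an illustration under some assumptions: it swaps do() for dplyr::group_modify(), calls base predict() directly, names the new column pred to mirror the default of add_predictions, and uses set.seed() purely to make the random input reproducible. The names joined and with_pred are made up for the sketch.

    ```r
    library(dplyr)

    set.seed(1)  # assumption: added only so the sketch is reproducible
    example_table <- data.frame(x = c(1:5, 1:5),
                                y = c((1:5) + rnorm(5), 2 * (5:1)),
                                groups = rep(LETTERS[1:2], each = 5))

    # One lm per group, stored in a list-column, as in the answer
    models <- example_table %>%
      group_by(groups) %>%
      do(model = lm(y ~ x, data = .)) %>%
      ungroup()
    joined <- left_join(as_tibble(example_table), models, by = "groups")

    # For each group, predict with that group's own (first) model
    with_pred <- joined %>%
      group_by(groups) %>%
      group_modify(~ mutate(.x, pred = unname(predict(.x$model[[1]], newdata = .x)))) %>%
      ungroup()
    ```

    Group B was generated as an exact line (y = 12 - 2x), so its predictions should reproduce its y values, which makes a convenient sanity check.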

    Simplified

    With only one model, add_predictions can be called directly on the whole data frame.

    # take one of the models
    model <- example_table$model[[6]]
    
    # generate the extra column
    example_table %>%
      modelr::add_predictions(model)
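
    For a single model, the effect is roughly that of calling base predict() on the whole data frame and storing the result in a new column. The sketch below assumes this and uses only base R. The names model_b and with_pred are hypothetical, and the model is refitted here on group B rather than taken from the model column. Note that a single model is applied to every row, including rows from the other group.

    ```r
    set.seed(1)  # assumption: added only so the sketch is reproducible
    example_table <- data.frame(x = c(1:5, 1:5),
                                y = c((1:5) + rnorm(5), 2 * (5:1)),
                                groups = rep(LETTERS[1:2], each = 5))

    # Fit one model on group B only (hypothetical stand-in for a stored model)
    model_b <- lm(y ~ x, data = subset(example_table, groups == "B"))

    # Apply that one model to all rows, regardless of group
    with_pred <- example_table
    with_pred$pred <- unname(predict(model_b, newdata = example_table))
    ```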
    

    Recipes

    Nowadays, the tidyverse is shifting from the modelr package to recipes, so that might be the new way to go once that package matures.
