Using predict() with a list of lm() objects

甜味超标 2020-12-07 23:25

I have data on which I regularly run regressions. Each "chunk" of data is fit with a different regression. Each state, for example, might have a different function that expla

6 Answers
  • 2020-12-07 23:53

    A solution with just base R. The format of the output is different, but all the values are right there.

    models <- lapply(split(myData, myData$state), 'lm', formula = value ~ year)
    pred4  <- mapply('predict', models, split(newData, newData$state))
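
    To make the "different format" concrete, here is a minimal sketch on made-up data. The toy `myData` and `newData` below are assumptions, not the asker's data, and `newByState` and `predLong` are names introduced only for illustration. When every state has the same number of new rows, `mapply` simplifies the result to a matrix with one column per state, which can then be flattened into one row per state and year.

    ```r
    # Hypothetical toy data (not the asker's): two states, a linear trend plus noise
    set.seed(1)
    myData <- expand.grid(year = 1:10, state = c(50, 51))
    myData$value <- 1000 * (myData$state - 49) + 20 * myData$year + rnorm(nrow(myData))
    newData <- expand.grid(year = 11:13, state = c(50, 51))

    # One model per state, as in the answer above
    models <- lapply(split(myData, myData$state), lm, formula = value ~ year)
    newByState <- split(newData, newData$state)

    # Equal-length results are simplified to a matrix: rows = new years, columns = states
    pred4 <- mapply(predict, models, newByState)

    # Flatten column by column into one row per (state, year) pair
    predLong <- data.frame(
      state = rep(colnames(pred4), each = nrow(pred4)),
      year  = unlist(lapply(newByState, `[[`, "year"), use.names = FALSE),
      value = as.vector(pred4)
    )
    ```

    If states can have different numbers of new rows, passing `SIMPLIFY = FALSE` to `mapply` keeps a named list instead of attempting the matrix.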
    
  • 2020-12-07 23:54

    You need to use mdply to supply both the model and the data to each function call:

    dataList <- dlply(newData, "state")
    
    preds <- mdply(cbind(mod = modelList, df = dataList), function(mod, df) {
      mutate(df, pred = predict(mod, newdata = df))
    })
    
  • 2020-12-07 23:57

    I take it the hard part is matching each state in newData to the corresponding model.

    Something like this perhaps?

    predList <- dlply(newData, "state", function(x) {
      predict(modelList[[as.character(min(x$state))]], x) 
    })
    

    Here I used a "hacky" way of extracting the corresponding state model: as.character(min(x$state))

    ...There is probably a better way?

    Output:

    > predList[1:2]
    $`50`
           1        2        3        4        5        6        7        8        9       10       11 
    5176.326 5274.907 5373.487 5472.068 5570.649 5669.229 5767.810 5866.390 5964.971 6063.551 6162.132 
    
    $`51`
          12       13       14       15       16       17       18       19       20       21       22 
    5514.825 5626.160 5737.496 5848.832 5960.167 6071.503 6182.838 6294.174 6405.510 6516.845 6628.181
    

    Or, if you want a data.frame as output:

    predData <- ddply(newData, "state", function(x) {
      y <- predict(modelList[[as.character(min(x$state))]], x)
      data.frame(id = names(y), value = c(y))
    })
    

    Output:

    > head(predData)
      state id    value
    1    50  1 5176.326
    2    50  2 5274.907
    3    50  3 5373.487
    4    50  4 5472.068
    5    50  5 5570.649
    6    50  6 5669.229
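
    On the question of a less hacky lookup: one possible alternative, sketched here in base R on made-up data, is to split `newData` by state and use the names of the pieces as keys into `modelList`, rather than inspecting the `state` column inside each piece. The toy data and the names `newByState`, `keys` and `missingModels` are assumptions for illustration. The sketch also checks that every state in the new data has a fitted model before predicting.

    ```r
    # Hypothetical toy data (not the asker's)
    set.seed(1)
    myData <- expand.grid(year = 1:10, state = c(50, 51))
    myData$value <- 1000 * (myData$state - 49) + 20 * myData$year + rnorm(nrow(myData))
    modelList <- lapply(split(myData, myData$state), function(d) lm(value ~ year, data = d))

    # New data that covers only one of the modelled states
    newData <- data.frame(year = 11:13, state = 51)

    # split() names each piece after its state, so the names double as lookup keys
    newByState <- split(newData, newData$state)
    keys <- names(newByState)

    # Fail early if a state in newData has no fitted model
    missingModels <- setdiff(keys, names(modelList))
    if (length(missingModels) > 0) {
      stop("no model for state(s): ", paste(missingModels, collapse = ", "))
    }

    # Look up each model by name and predict on the matching piece
    predList <- lapply(setNames(keys, keys), function(k) {
      predict(modelList[[k]], newdata = newByState[[k]])
    })
    ```

    Because the lookup is by name, it does not depend on `modelList` and the new data listing the states in the same order.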
    
  • 2020-12-07 23:59

    What is wrong with

    lapply(modelList, predict, newData)
    

    ?
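
    For readers wondering the same thing, a small sketch on made-up data (the toy `myData` and `newData` are assumptions, not the asker's data) shows what that call does: every model predicts every row of `newData`, including rows that belong to other states.

    ```r
    # Hypothetical toy data (not the asker's): state 51 sits about 1000 above state 50
    set.seed(42)
    myData <- expand.grid(year = 1:10, state = c(50, 51))
    myData$value <- 1000 * (myData$state - 49) + 20 * myData$year + rnorm(nrow(myData))
    newData <- expand.grid(year = 11:12, state = c(50, 51))
    modelList <- lapply(split(myData, myData$state), function(d) lm(value ~ year, data = d))

    # Each model is applied to all rows of newData, not just its own state's rows
    allRows <- lapply(modelList, predict, newData)
    lengths(allRows)  # one prediction per row of newData, for every model

    # Rows of state 51 as predicted by the state-50 model ...
    wrong <- allRows[["50"]][newData$state == 51]
    # ... versus the same rows predicted by the state-51 model
    right <- predict(modelList[["51"]], newData[newData$state == 51, ])
    ```

    The state-50 model ignores the `state` column entirely, so `wrong` and `right` differ by roughly the gap between the two states' intercepts.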

    EDIT:

    Thanks for explaining what is wrong with that. How about:

    newData <- data.frame(year)
    ldply(modelList, function(model) {
      data.frame(newData, predict=predict(model, newData))
    })
    

    Iterate over the models, and apply the new data (which is the same for each state since you just did an expand.grid to create it).

    EDIT 2:

    If newData does not have the same values for year for every state as in the example, a more general approach can be used. Note that this uses the original definition of newData, not the one in the first edit.

    ldply(state, function(s) {
      nd <- newData[newData$state==s,]
      data.frame(nd, predict=predict(modelList[[as.character(s)]], nd))
    })
    

    First 15 lines of this output:

       year state  predict
    1    50    50 5176.326
    2    51    50 5274.907
    3    52    50 5373.487
    4    53    50 5472.068
    5    54    50 5570.649
    6    55    50 5669.229
    7    56    50 5767.810
    8    57    50 5866.390
    9    58    50 5964.971
    10   59    50 6063.551
    11   60    50 6162.132
    12   50    51 5514.825
    13   51    51 5626.160
    14   52    51 5737.496
    15   53    51 5848.832
    
  • 2020-12-08 00:06

    Here's my attempt:

    predNaughty <- ddply(newData, "state", transform,
      value=predict(modelList[[paste(piece$state[1])]], newdata=piece))
    head(predNaughty)
    #   year state    value
    # 1   50    50 5176.326
    # 2   51    50 5274.907
    # 3   52    50 5373.487
    # 4   53    50 5472.068
    # 5   54    50 5570.649
    # 6   55    50 5669.229
    predDiggsApproved <- ddply(newData, "state", function(x)
      transform(x, value=predict(modelList[[paste(x$state[1])]], newdata=x)))
    head(predDiggsApproved)
    #   year state    value
    # 1   50    50 5176.326
    # 2   51    50 5274.907
    # 3   52    50 5373.487
    # 4   53    50 5472.068
    # 5   54    50 5570.649
    # 6   55    50 5669.229
    

    JD Long edit

    I was inspired enough to work out an adply() option:

    pred3 <- adply(newData, 1,  function(x)
        predict(modelList[[paste(x$state)]], newdata=x))
    head(pred3)
    #   year state        1
    # 1   50    50 5176.326
    # 2   51    50 5274.907
    # 3   52    50 5373.487
    # 4   53    50 5472.068
    # 5   54    50 5570.649
    # 6   55    50 5669.229
    
  • 2020-12-08 00:08

    Maybe I'm missing something, but I believe lmList is the ideal tool here:

    library(nlme)
    ll = lmList(value ~ year | state, data=myData)
    predict(ll, newData)
    
    
    ## Or, to show that it produces the same results as the other proposed methods...
    newData[["value"]] <- predict(ll, newData)
    head(newData)
    #   year state    value
    # 1   50    50 5176.326
    # 2   51    50 5274.907
    # 3   52    50 5373.487
    # 4   53    50 5472.068
    # 5   54    50 5570.649
    # 6   55    50 5669.229
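
    As a sanity check, the sketch below compares `lmList` predictions with separate per-state `lm()` fits on made-up data. The toy data and the names `viaLmList`, `perState` and `viaLm` are assumptions for illustration. Note that `newData` here is already sorted by state. As far as I can tell, `predict()` on an `lmList` returns its values concatenated group by group, so it seems worth verifying the row alignment before assigning the result back to a `newData` that is not sorted by state.

    ```r
    library(nlme)  # a recommended package that usually ships with R

    # Hypothetical toy data (not the asker's)
    set.seed(1)
    myData <- expand.grid(year = 1:10, state = c(50, 51))
    myData$value <- 1000 * (myData$state - 49) + 20 * myData$year + rnorm(nrow(myData))
    newData <- expand.grid(year = 11:13, state = c(50, 51))  # rows sorted by state

    # One call fits a separate regression per state
    ll <- lmList(value ~ year | state, data = myData)
    viaLmList <- as.numeric(predict(ll, newData))

    # Reference: explicit per-state lm() fits, predicted state by state in the same order
    perState <- lapply(split(myData, myData$state), function(d) lm(value ~ year, data = d))
    viaLm <- unlist(Map(predict, perState, split(newData, newData$state)), use.names = FALSE)

    # The two approaches should agree up to numerical tolerance
    all.equal(viaLmList, viaLm)
    ```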
    