h2o.glm lambda search not appearing to iterate over all lambdas

backend · open · 2 answers · 485 views
离开以前 2021-01-14 02:35

Please consider the following basic reproducible example:

library(h2o)
h2o.init()
data("iris")
iris.hex = as.h2o(iris, "iris.hex")
mod = h2o.glm(y = "Sepal.Length", x = setdiff(colnames(iris), "Sepal.Length"),
              training_frame = iris.hex, nfolds = 2, seed = 100,
              lambda_search = TRUE, family = "gamma")


        
2 Answers
  •  春和景丽
    2021-01-14 03:10

    What is happening is that it learns from the cross-validation models in order to optimize the parameters used for the final run. (BTW, you are using nfolds = 2, which is fairly unusual for a small data set: each fold model learns on just 75 records, then tests on the other 75, so there is going to be a lot of noise in what it learns from CV.)
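    To make that 75/75 split concrete, here is a plain-Python sketch of 2-fold CV on a 150-row data set (an illustration only, not H2O's internal fold assignment):

```python
import random

# Sketch: 2-fold CV on iris-sized data (150 rows). Each fold model
# trains on 75 rows and is scored on the other 75, so per-fold
# estimates are noisy on a data set this small.
rows = list(range(150))
random.seed(100)
random.shuffle(rows)
fold1, fold2 = rows[:75], rows[75:]
print(len(fold1), len(fold2))  # 75 75
```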

    Following on from your code:

    tail(mod@allparameters$lambda)
    mod@model$lambda_best
    

    I'm using H2O 3.14.0.1, so here is what I get:

    [1] 0.002129615 0.001940426 0.001768044 0.001610975 0.001467861 0.001337460
    

    and:

    [1] 0.001610975
    

    Then if we go look at the same for the 2 CV models:

    lapply(mod@model$cross_validation_models, function(m_cv){
      m <- h2o.getModel(m_cv$name)
      list( tail(m@allparameters$lambda), m@model$lambda_best )
      })
    

    I get:

    [[1]]
    [[1]][[1]]
    [1] 0.0002283516 0.0002080655 0.0001895815 0.0001727396 0.0001573939 0.0001434115
    
    [[1]][[2]]
    [1] 0.002337249
    
    
    [[2]]
    [[2]][[1]]
    [1] 0.0002283516 0.0002080655 0.0001895815 0.0001727396 0.0001573939 0.0001434115
    
    [[2]][[2]]
    [1] 0.00133746
    

    I.e. it seems the lowest best lambda found among the CV models was 0.00133746, so that was used as the early-stopping point for the final model's lambda search.

    BTW, if you poke around in those cv models you will see they both tried 100 values for lambda. It is only the final model that does the extra optimization.
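    Those 100 values follow the usual lambda-search recipe: a geometric sequence descending from a data-derived lambda_max down to lambda_max * lambda_min_ratio. A sketch in Python (the 1e-4 ratio and nlambdas = 100 here are glmnet-style assumptions, not necessarily H2O's exact defaults):

```python
import numpy as np

# Sketch of how a 100-value lambda path is typically built
# (glmnet-style; H2O's exact recipe may differ): geometrically
# spaced from lambda_max down to lambda_max * lambda_min_ratio.
def lambda_path(lambda_max, lambda_min_ratio=1e-4, nlambdas=100):
    return np.geomspace(lambda_max, lambda_max * lambda_min_ratio, nlambdas)

path = lambda_path(1.0)
print(len(path))  # 100 values, from 1.0 down to 0.0001
```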

    (I'm thinking of it as a time optimization, but reading p.26/27 of the Generalized Linear Models booklet (free download from https://www.h2o.ai/resources/), I think it is mainly about using the cv data to avoid over-fitting.)

    You can explicitly specify a set of lambda values to try. BUT, the cross-validation learning will still take priority for the final model. E.g. in the following the final model only tried the first 4 of the 6 lambda values I suggested, because both CV models liked 0.001 best.

    mx = h2o.glm(y = "Sepal.Length", x = setdiff(colnames(iris), "Sepal.Length"), 
                training_frame = iris.hex, nfolds = 2, seed = 100,
                lambda = c(1.0, 0.1, 0.01, 0.001, 0.0001, 0), lambda_search = T,
                family = "gamma")
    
    tail(mx@allparameters$lambda)
    mx@model$lambda_best
    
    lapply(mx@model$cross_validation_models, function(m_cv){
      m <- h2o.getModel(m_cv$name)
      list( tail(m@allparameters$lambda), m@model$lambda_best )
    })
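    One way to picture that early stopping (a simplified Python sketch of the observed behaviour, not H2O's actual code): the final model walks its descending lambda path, but stops once it reaches the smallest best lambda reported by the CV models.

```python
# Simplified sketch (not H2O's implementation): truncate a descending
# lambda path at the smallest "best" lambda found by the CV models.
def truncated_lambda_path(full_path, cv_best_lambdas):
    stop_at = min(cv_best_lambdas)  # e.g. both CV models liked 0.001 best
    path = []
    for lam in full_path:
        path.append(lam)
        if lam <= stop_at:          # early stop: no need to go lower
            break
    return path

# The 6 suggested lambdas from the example above; only 4 get tried.
suggested = [1.0, 0.1, 0.01, 0.001, 0.0001, 0]
print(truncated_lambda_path(suggested, [0.001, 0.001]))
# -> [1.0, 0.1, 0.01, 0.001]
```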
    
