caret::train: specify model-generation-parameters

Posted by 放肆的年华 on 2019-11-29 04:01:07

Question


I'm using the caret library in R for model generation. I want to generate an earth (aka MARS) model and I want to specify the degree parameter for this model generation. According to the documentation (page 11) the earth method supports this parameter.

I get the following error message when specifying the parameter:

> library(caret)
> data(trees)
> train(Volume~Girth+Height, data=trees, method='earth', degree=1)
Error in { : 
  task 1 failed - "formal argument "degree" matched by multiple actual arguments"

How can I avoid this error when specifying the degree parameter?

> sessionInfo()
R version 2.15.0 (2012-03-30)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8    
 [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8   
 [7] LC_PAPER=C                 LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] earth_3.2-3    plotrix_3.4    plotmo_1.3-1   leaps_2.9      caret_5.15-023
 [6] foreach_1.4.0  cluster_1.14.2 reshape_0.8.4  plyr_1.7.1     lattice_0.20-6

loaded via a namespace (and not attached):
[1] codetools_0.2-8 compiler_2.15.0 grid_2.15.0     iterators_1.0.6
[5] tools_2.15.0   

Answer 1:


I have always found the functions in caret both useful and somewhat maddening. Here's what's going on.

You're attempting to pass an argument to earth via the ... argument to train. The documentation for train contains this description for that argument:

arguments passed to the classification or regression routine (such as randomForest). Errors will occur if values for tuning parameters are passed here.

Tuning parameter, eh? Well, if you scroll down and examine the official list of tuning parameters for each model type, you'll see that for earth, they are degree and nprune.

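So the issue here is that train is designed to automate grid searching over the tuning parameters, and the ... argument is for passing any further arguments to the model fitting function, not the tuning parameters themselves. Because train already supplies degree when it calls earth, a degree passed through ... reaches earth a second time, which is exactly what the error message complains about.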
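The mechanism can be reproduced in base R without caret. This is only a minimal sketch: fit_model and tune are hypothetical stand-ins for earth and train, not the real functions, and the value degree = 2 is an arbitrary assumption.

```r
# Hypothetical stand-in for a model-fitting function such as earth()
fit_model <- function(x, degree = 1, ...) {
  paste("fitting with degree", degree)
}

# Hypothetical stand-in for train(): it sets the tuning parameter itself
# and also forwards whatever the caller supplied in `...`
tune <- function(x, ...) {
  fit_model(x, degree = 2, ...)
}

# Without degree in `...` the call works
ok <- tune(1:10)

# With degree in `...`, fit_model() receives degree twice and R refuses
msg <- tryCatch(
  tune(1:10, degree = 1),
  error = function(e) conditionMessage(e)
)
print(msg)
# formal argument "degree" matched by multiple actual arguments
```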

If you want to set the tuning parameters you'll need to use other arguments, like so:

train(Volume~Girth+Height, data=trees, method='earth',
      tuneGrid = data.frame(.degree = 1,.nprune = 5))

Note that the columns are named with leading periods. (This matches the caret 5.x release shown in the question; as far as I know, caret 6.0 and later expect the plain column names degree and nprune instead.) One frustration: the default value of nprune in earth is NULL, so I'm not sure you can pass only the default values this way. (Generally, setting a data frame column to NULL simply removes it.)
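The remark about NULL can be checked in base R. The sketch below reuses the grid from the answer and only shows that assigning NULL to a column drops it; it says nothing about how train itself treats a missing nprune.

```r
# Same one-row grid as in the train() call above
grid <- data.frame(.degree = 1, .nprune = 5)

# Assigning NULL does not store a NULL value; it deletes the column
grid$.nprune <- NULL

remaining <- names(grid)
print(remaining)  # only ".degree" is left
```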




Answer 2:


I found out how to do it; joran pointed me in the right direction:

Create a new function that generates the training grid. This function must accept two parameters, len and data. To retrieve the original training grid, call the createGrid function provided by the caret package, then modify the grid as needed. For example, to leave the nprune values unchanged and add degree values from 1 to 5, use the following code:

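  createMARSGrid <- function(len, data) {
      # Start from the default grid caret would build for earth
      g <- createGrid("earth", len, data)
      # Cross the default nprune values with degree = 1..5
      g <- expand.grid(.nprune = g$.nprune, .degree = seq(1, 5))
      return(g)
  }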

Then invoke it like this:

train(formula, data=data, method='earth', tuneGrid = createMARSGrid)
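To see what such a grid looks like without depending on a particular caret version, the expand.grid step can be run on its own. This is a sketch only: the nprune_values vector is a made-up assumption standing in for whatever createGrid returns.

```r
# Hypothetical nprune candidates; in the answer these come from createGrid()
nprune_values <- c(2, 4, 6)

# Every nprune value is paired with every degree from 1 to 5
grid <- expand.grid(.nprune = nprune_values, .degree = seq(1, 5))

n_rows <- nrow(grid)  # 3 nprune values x 5 degrees = 15 candidate models
print(n_rows)
print(head(grid, 4))
```

Each row of the grid is one candidate model, so the number of fits per resample grows with the product of the two vectors' lengths.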


Source: https://stackoverflow.com/questions/10498477/carettrain-specify-model-generation-parameters
