Question
I'm using the `caret` library in R for model generation. I want to generate an `earth` (aka MARS) model and I want to specify the `degree` parameter for this model generation. According to the documentation (page 11), the `earth` method supports this parameter.
I get the following error message when specifying the parameter:
> library(caret)
> data(trees)
> train(Volume~Girth+Height, data=trees, method='earth', degree=1)
Error in { :
task 1 failed - "formal argument "degree" matched by multiple actual arguments"
How can I avoid this error when specifying the `degree` parameter?
> sessionInfo()
R version 2.15.0 (2012-03-30)
Platform: x86_64-pc-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8
[5] LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8
[7] LC_PAPER=C LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] earth_3.2-3 plotrix_3.4 plotmo_1.3-1 leaps_2.9 caret_5.15-023
[6] foreach_1.4.0 cluster_1.14.2 reshape_0.8.4 plyr_1.7.1 lattice_0.20-6
loaded via a namespace (and not attached):
[1] codetools_0.2-8 compiler_2.15.0 grid_2.15.0 iterators_1.0.6
[5] tools_2.15.0
Answer 1:
I have always found the functions in caret both useful and somewhat maddening. Here's what's going on.
You're attempting to pass an argument to `earth` via the `...` argument to `train`. The documentation for `train` contains this description for that argument:
arguments passed to the classification or regression routine (such as randomForest). Errors will occur if values for tuning parameters are passed here.
Tuning parameter, eh? Well, if you scroll down and examine the official list of tuning parameters for each model type, you'll see that for `earth` they are `degree` and `nprune`.
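So the issue here is that `train` is designed to automate grid searching over the tuning parameters, and it sets those parameters itself when it calls the model fitting function. The `...` argument is for passing any further arguments to the model fitting function, but not the tuning parameters; supplying `degree` there means it reaches `earth` twice.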
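The mechanism behind the error message can be reproduced in base R without caret. The following is only a sketch: `fit_model` and `tune_and_fit` are hypothetical stand-ins for `earth` and `train`, not the real implementations, and they only mimic the way a wrapper can end up supplying the same named argument twice.

```r
# Hypothetical stand-in for a model-fitting function such as earth().
fit_model <- function(x, degree = 1, ...) {
  paste("fitting with degree", degree)
}

# Hypothetical stand-in for train(): it sets the tuning parameter itself
# and also forwards whatever the caller put in `...`.
tune_and_fit <- function(x, ...) {
  fit_model(x, degree = 2, ...)
}

# Passing degree through `...` supplies it twice, which R rejects.
result <- tryCatch(
  tune_and_fit(1:10, degree = 1),
  error = function(e) conditionMessage(e)
)
print(result)
```

Calling `tune_and_fit(1:10)` without `degree` works, because the argument then reaches `fit_model` only once.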
If you want to set the tuning parameters you'll need to use other arguments, like so:
train(Volume~Girth+Height, data=trees, method='earth',
tuneGrid = data.frame(.degree = 1,.nprune = 5))
Note how the columns are named with leading periods. Also, it is frustrating that since the default value in `earth` for `nprune` is `NULL`, I'm not sure you can pass only the default values in this way. (Generally, setting things to `NULL` in data frames will simply remove them.)
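To search over several values rather than a single combination, the grid can be built with base R's `expand.grid`. This is a minimal sketch: the candidate values are arbitrary illustrations, not recommendations, and the commented-out `train` call assumes a caret version that expects the leading-period column names described above.

```r
# Candidate values for both earth tuning parameters (illustrative values only).
mars_grid <- expand.grid(.degree = 1:2, .nprune = c(3, 5, 7))
print(mars_grid)

# With caret and earth installed, the grid would be passed like this:
# library(caret)
# data(trees)
# train(Volume ~ Girth + Height, data = trees, method = "earth",
#       tuneGrid = mars_grid)
```

Each row of `mars_grid` is one combination of `degree` and `nprune` that `train` would evaluate.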
Answer 2:
I found out how to do it; joran pointed me in the right direction:
Create a new function which generates the training grid. This function must accept the two parameters `len` and `data`. To retrieve the original training grid, you can call the `createGrid` function provided by the `caret` package. You can then modify the grid to your needs. For example, to leave the `nprune` parameter unchanged and vary `degree` from 1 to 5, use the following code:
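createMARSGrid <- function(len, data) {
  # Start from caret's default grid for earth to reuse its nprune values.
  g <- createGrid("earth", len, data)
  # Cross those nprune values with degree = 1..5.
  g <- expand.grid(.nprune = g$.nprune, .degree = seq(1, 5))
  return(g)
}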
Then invoke it like this:
train(formula, data=data, method='earth', tuneGrid = createMARSGrid)
Source: https://stackoverflow.com/questions/10498477/carettrain-specify-model-generation-parameters