Question
I'm trying to better understand how train(tuneLength = ) works in {caret}. My confusion arose while trying to understand some of the differences between the SVM methods from {kernlab}. I've reviewed the documentation (here) and the caret training page (here).
My toy example was creating five models using the iris dataset. Results are here, and reproducible code is here (they're rather long so I didn't copy and paste them into the post).
From the {caret} documentation:
tuneLength
an integer denoting the amount of granularity in the tuning parameter grid. By default, this argument is the number of levels for each tuning parameters that should be generated by train. If trainControl has the option search = "random", this is the maximum number of tuning parameter combinations that will be generated by the random search. (NOTE: If given, this argument must be named.)
In this example, trainControl(search = "random") and train(tuneLength = 30) are used, but there appear to be 67 results, not 30 (the maximum number of tuning parameter combinations). I tried playing around to see if maybe there were 30 unique ROC values, or even ydim values, but by my count there are not.
For the toy example, I created the following table:
Is there a way to see what's going on "under the hood"? For instance, M1 (svmRadial) and M3 (svmRadialSigma) both take, and are given, the same tune parameters, but based on calling $results appear to use them differently?
My understanding of train(tuneLength = 9) was that both models would produce results with 9 values each of sigma and C, i.e. 9 x 9 combinations, since 9 is the number of levels for each tuning parameter (the exception being random search)? Similarly, M4 would produce 9^3 results, since train(tuneLength = 9) and there are 3 tuning parameters?
Michael
Answer 1:
I need to update the package documentation more, but the details are spelled out on the package web page for random search:
"The total number of unique combinations is specified by the tuneLength option to train."
However, this is particularly muddy for SVMs using the RBF kernel. Here is a rundown:
- svmRadial tunes over cost and uses a single value of sigma based on kernlab's sigest function. For grid search, tuneLength is the number of cost values to test, and for random search it is the total number of (cost, sigma) pairs to evaluate.
- svmRadialCost is the same as svmRadial, but sigest is run inside of each resampling loop. For random search, it does not tune over sigma.
- svmRadialSigma with grid search tunes over both cost and sigma. In a moment of sub-optimal cognitive performance, I set this up to try at most 6 values of sigma during grid search, since I felt that the cost space needed a wider range. For random search it does the same as svmRadial.
- svmRadialWeight is the same as svmRadial but also considers class weights and is for 2-class problems only.
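The counts in the rundown above can be sketched in base R. This only illustrates the arithmetic and is not caret's actual grid code: the sigma range, the cost sequence, and every variable name here are placeholder assumptions.

```r
# Counting sketch only; sigma and cost values are hypothetical placeholders.
tune_length <- 9

# svmRadial, grid search: a single sigma, tune_length cost values.
grid_radial <- expand.grid(
  sigma = 0.5,                              # stands in for the sigest estimate
  C     = 2^((1:tune_length) - 3)           # assumed cost sequence
)

# svmRadialSigma, grid search: at most 6 sigma values, tune_length cost values.
n_sigma <- min(6, tune_length)
grid_radial_sigma <- expand.grid(
  sigma = seq(0.1, 1, length.out = n_sigma), # assumed sigma range
  C     = 2^((1:tune_length) - 3)
)

# Random search (svmRadial, svmRadialSigma): tune_length (cost, sigma) pairs.
set.seed(1)
random_pairs <- data.frame(
  sigma = exp(runif(tune_length, log(0.1), log(1))),
  C     = 2^runif(tune_length, min = -5, max = 10)
)

counts <- c(
  svmRadial_grid      = nrow(grid_radial),
  svmRadialSigma_grid = nrow(grid_radial_sigma),
  random_search       = nrow(random_pairs)
)
print(counts)
```

With tuneLength = 9 this gives 9, 54, and 9 candidate rows respectively, so under these assumptions svmRadialSigma produces 6 x 9 rows rather than the 9 x 9 grid the question expected.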
As for the SOM example on the web page, well, that's a bug. I over-sample the SOM parameter space since there needs to be a filter for xdim <= ydim & xdim*ydim < nrow(x). The bug is from me not keeping the right number of parameter combinations.
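The over-sample-then-filter idea might look roughly like the base R sketch below. It is not caret's implementation: the candidate ranges, the oversampling factor, and the variable names are all assumptions, and only xdim and ydim are modelled. The final head() call is the trimming step whose absence would leave more rows than tuneLength.

```r
# Hypothetical sketch of over-sampling, filtering, and trimming SOM candidates.
set.seed(42)
n_rows      <- 150                 # e.g. nrow(iris)
tune_length <- 30
n_draw      <- 10 * tune_length    # assumed oversampling factor

# Draw more candidates than needed, because the filter will discard many.
candidates <- data.frame(
  xdim = sample(1:20, n_draw, replace = TRUE),  # assumed range
  ydim = sample(1:20, n_draw, replace = TRUE)   # assumed range
)

# Keep only combinations that satisfy the constraint from the answer.
valid <- subset(candidates, xdim <= ydim & xdim * ydim < n_rows)
valid <- unique(valid)

# Trim back to the requested number of combinations.
kept <- head(valid, tune_length)
print(nrow(kept))
```

Skipping the final trimming step leaves however many candidates survive the filter, which is one way a random search could report more results than tuneLength.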
Source: https://stackoverflow.com/questions/38859705/r-understanding-caret-traintunelength-and-svm-methods-from-kernlab