Question
I'm building an xgboost classification task in R using the mlr package:
# define task
Task <- mlr::makeClassifTask(id = "classif.xgboost",
                             data = df,
                             target = "response",
                             weights = NULL,
                             positive = "yes",
                             check.data = TRUE,
                             blocking = folds)
# make a base learner
lrnBase <- makeLearner(cl = "classif.xgboost",
                       predict.type = "prob",  # "response" (= labels) or "prob" (= labels and probabilities)
                       predict.threshold = NULL)
I have to undersample one of my classes:
lrnUnder <- makeUndersampleWrapper(learner = lrnBase, usw.rate = 0.2, usw.cl = "no")
I also have to tune some of the learner's hyperparameters:
paramSet <- makeParamSet(makeNumericParam(id = "eta", lower = 0.005, upper = 0.4),
                         makeIntegerParam(id = "nrounds", lower = 1, upper = 100))
tuneControl <- makeTuneControlRandom(maxit = 100)
resampin <- makeResampleDesc(method = "CV",
                             iters = 4L,
                             predict = "test")
lrnTune <- makeTuneWrapper(learner = lrnUnder,
                           resampling = resampin,
                           measures = fp,
                           par.set = paramSet,
                           control = tuneControl)
My first question is: how can I get the FINAL tuned hyperparameters (and not the tuned hyperparameters corresponding to each iteration of the CV, i.e. not via the extract argument)? In the mlr tutorial I found that I have to train my model as follows:
mdl <- mlr::train(learner = lrnTune, task = Task)
getTuneResult(mdl)
but this does not work without nested resampling. So when I add this block to my code, it works:
resampout.desc <- makeResampleDesc(method = "CV",
                                   iters = length(levels(folds)),
                                   predict = "both",
                                   fixed = TRUE)
resampout <- makeResampleInstance(desc = resampout.desc, task = Task)
resamp <- mlr::resample(learner = lrnTune,
                        task = Task,
                        resampling = resampout,  # outer resampling
                        measures = f1,
                        models = FALSE,
                        extract = getTuneResult,
                        keep.pred = TRUE)
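As a side note on the extract argument: with extract = getTuneResult, the tuning result of each outer fold is stored in resamp$extract, and the per-fold best settings can be read out as below (a sketch assuming the resample() call above has run):

```r
# resamp$extract is a list with one TuneResult per outer CV fold;
# $x holds the best hyperparameter values found by that fold's inner tuning,
# $y the corresponding inner tuning performance
lapply(resamp$extract, function(tr) tr$x)
```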
My second question is: in principle, do I have to wrap my learner if I don't want to do nested resampling (i.e. evaluate the performance of my model)? Or can I simply make a non-wrapped learner and perform my tuning using tuneParams()?
Thank you in advance for your help; I got a bit confused about the functionality of wrapped learners and nested resampling.
Answer 1:
You can use tuneParams() to tune a learner and then extract the best hyperparameters as described in the tutorial (https://mlr.mlr-org.com/articles/tutorial/tune.html). You certainly don't have to wrap your learner; the point of doing this is so you can simply train a model without having to worry about what the hyperparameters are. You should do nested resampling, though, as otherwise your performance estimates may be biased.
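A minimal sketch of that tuneParams() workflow, reusing the Task, wrapped learner lrnUnder, and paramSet from the question (the resampling description and measure here are illustrative choices, not requirements):

```r
library(mlr)

# tune on the task with a plain (non-nested) 4-fold CV
res <- tuneParams(learner = lrnUnder,
                  task = Task,
                  resampling = makeResampleDesc("CV", iters = 4L),
                  measures = fp,
                  par.set = paramSet,
                  control = makeTuneControlRandom(maxit = 100))

# the final tuned hyperparameters, e.g. list(eta = ..., nrounds = ...)
res$x

# set them on the learner and train a final model on the full task
lrnTuned <- setHyperPars(lrnUnder, par.vals = res$x)
mdlFinal <- train(lrnTuned, Task)
```

Note that the performance reported by this single-level tuning is optimistically biased, which is why the answer still recommends nested resampling for performance estimation.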
Source: https://stackoverflow.com/questions/59458529/r-mlr-get-tuned-hyperparameters-for-a-wrapped-learner