mlr

R - mlr: Is there an easy way to get the variable importance of tuned support vector machine models in nested resampling (spatial)?

烂漫一生 submitted on 2021-02-09 11:46:24
Question: I am trying to get the variable importance for all predictors (or variables, or features) of a tuned support vector machine (SVM) model using e1071::svm through the mlr package in R, but I am not sure whether I am doing the assessment right. First, the idea: to get an honestly tuned SVM model, I am following the nested-resampling tutorial, using spatial n-fold cross-validation (SpRepCV) in the outer loop and spatial cross-validation (SpCV) in the inner loop. As tuning parameter gamma
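
The nested setup described in this question can be sketched roughly as follows. This is a minimal sketch and not the asker's code: mlr's bundled spatial.task stands in for the real data, and the search ranges, iteration counts, and the mmce measure are illustrative assumptions. It covers only the tuning and resampling structure, not the variable-importance step.

```r
library(mlr)

# Assumptions: spatial.task (shipped with mlr) replaces the asker's data;
# ranges and iteration counts are illustrative and kept deliberately tiny.
set.seed(42)
lrn <- makeLearner("classif.svm")

# Tune cost and gamma on a log2 scale
ps <- makeParamSet(
  makeNumericParam("cost", lower = -5, upper = 5, trafo = function(x) 2^x),
  makeNumericParam("gamma", lower = -5, upper = 5, trafo = function(x) 2^x)
)
ctrl <- makeTuneControlRandom(maxit = 3)

# Inner loop: spatial cross-validation, used only for tuning
inner <- makeResampleDesc("SpCV", iters = 2)
tuned_lrn <- makeTuneWrapper(lrn, resampling = inner, par.set = ps,
                             control = ctrl, measures = mmce,
                             show.info = FALSE)

# Outer loop: repeated spatial cross-validation for the performance estimate
outer <- makeResampleDesc("SpRepCV", folds = 2, reps = 2)
res <- resample(tuned_lrn, spatial.task, resampling = outer,
                measures = mmce, extract = getTuneResult,
                show.info = FALSE)

print(res$aggr)
```

Each element of `res$extract` holds the hyperparameters selected in one outer iteration, so the tuned settings can differ from fold to fold.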

Parallelization on resampling within a stacked learner (ensemble/stack of classification learners) doesn't work

好久不见. submitted on 2021-01-29 06:44:31
Question: The code below works fine; however, I would like to run it in parallel. I have tried different plans within future and future.apply but couldn't get it to work. Any help is appreciated. I am running on Windows, 8 cores. library(mlr3verse) library(future.apply) #> Warning: package 'future.apply' was built under R version 3.6.3 #> Loading required package: future #> Warning: package 'future' was built under R version 3.6.3 library(future) future::plan(multicore) tsk_clf = tsk("sonar") tsk_clf$col
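
One detail worth checking in a setup like this: the `multicore` plan relies on process forking, which is not available on Windows, so future falls back to sequential processing there. The sketch below is an assumption about what may be going wrong, not a confirmed diagnosis of the asker's problem. It selects `multisession` when forking is unsupported; the toy workload and the worker count are placeholders.

```r
library(future)
library(future.apply)

# Forked workers ("multicore") are unavailable on Windows (and disabled in
# RStudio), so fall back to background R sessions ("multisession") there.
if (supportsMulticore()) {
  plan(multicore, workers = 2)
} else {
  plan(multisession, workers = 2)
}

# Toy workload standing in for the resampling iterations
squares <- future_sapply(1:4, function(i) i^2)
print(squares)

plan(sequential)  # shut the workers down again
```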

mlr: retrieve output of generateFilterValuesData within CV loop

大城市里の小女人 submitted on 2021-01-28 06:44:12
Question: If I fuse a learner with a filter method using makeFilterWrapper, then I know I can perform feature selection using that filter within a cross-validation loop. As I understand it, filterFeatures is called before each model fit, and it calls generateFilterValuesData. But is it possible to retrieve the values generated by generateFilterValuesData, using that filter, within each iteration of cross-validation? For example: library(survival) library(mlr) data(veteran) set.seed(24601) configureMlr

MLR random forest multi label get feature importance

我与影子孤独终老i submitted on 2020-05-29 09:42:34
Question: I am using the multilabel.randomForestSRC learner from the mlr package for a multi-label classification problem, and I would like to return the variable importances. The getFeatureImportance function returns this error. Code: getFeatureImportance(mod) Error: Error in checkLearner(object$learner, props = "featimp") : Learner 'multilabel.randomForestSRC' must support properties 'featimp', but does not support featimp' Answer 1: You can extract the variable importance using randomForestSRC::vimp, using the

MLR random forest multi label get feature importance

北战南征 submitted on 2020-05-29 09:42:32
Question: I am using the multilabel.randomForestSRC learner from the mlr package for a multi-label classification problem, and I would like to return the variable importances. The getFeatureImportance function returns this error. Code: getFeatureImportance(mod) Error: Error in checkLearner(object$learner, props = "featimp") : Learner 'multilabel.randomForestSRC' must support properties 'featimp', but does not support featimp' Answer 1: You can extract the variable importance using randomForestSRC::vimp, using the

MLR: How to compute permuted feature importance for sequential MBO parametrized models?

房东的猫 submitted on 2020-01-16 07:49:31
Question: I am doing nested cross-validation using the packages mlr and mlrMBO. The inner CV is used for parametrization (i.e., to find the optimal parameters). Since I want to compare the performance of different learners, I conduct a benchmark experiment using mlr's benchmark function. My question is the following: Is it possible to permute on the parametrized model/learner? When I call generateFeatureImportanceData on the learner I use in the benchmark experiment, the model is estimated again
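
The idea of permuting features against an already fitted model, without re-estimating it, can be illustrated in base R. This is a minimal sketch and not mlr's generateFeatureImportanceData: a linear model on mtcars stands in for the tuned learner, and the `rmse` helper, the feature list, and the 20 repetitions are hypothetical choices made for illustration.

```r
set.seed(1)

# Stand-in for the tuned model: it is fitted once and never refitted below
fit <- lm(mpg ~ wt + hp + qsec, data = mtcars)

# Hypothetical helper: root mean squared error of a model on a data set
rmse <- function(model, data) {
  sqrt(mean((data$mpg - predict(model, newdata = data))^2))
}

baseline <- rmse(fit, mtcars)
features <- c("wt", "hp", "qsec")

# Importance = average increase in RMSE after shuffling one feature
importance <- sapply(features, function(f) {
  mean(replicate(20, {
    shuffled <- mtcars
    shuffled[[f]] <- sample(shuffled[[f]])
    rmse(fit, shuffled) - baseline
  }))
})

print(sort(importance, decreasing = TRUE))
```

Because only `predict` is called inside the loop, the fitted parameters stay fixed; with a tuned learner, the same loop would run on the model trained with the selected hyperparameters, ideally on held-out data rather than the training set used here.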

R: How to use parallelMap (with mlr, xgboost) on a Linux server? Unexpected performance compared to Windows

廉价感情. submitted on 2019-12-24 16:58:17
Question: I am tuning an xgboost model in mlr and am trying to parallelize it at the hyperparameter-tuning level with parallelMap. I have code that works successfully on my Windows machine (with only 8 cores) and would like to make use of a Linux server (with 72 cores). I have not been able to gain any computational advantage by moving to the server, and I think this is a result of holes in my understanding of the parallelMap parameters. I do not understand the

MLR - Survival analysis with time-dependent data

独自空忆成欢 submitted on 2019-12-24 14:35:05
Question: I am using mlr and would like to be able to use the extended version of the Cox PH model for right-censored data with time-dependent covariates. This is what I have tried, following the vignette on time-dependent covariates https://cran.microsoft.com/web/packages/survival/vignettes/timedep.pdf (section 3.4): library(survival) library(mlr) data(pbc) temp <- subset(pbc, id <= 312, select=c(id:sex, stage)) # baseline pbc2 <- tmerge(temp, temp, id=id, death = event(time, status)) #set range pbc2 <- tmerge
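
For orientation, the extended Cox model with time-dependent covariates looks roughly like this in plain survival, outside mlr. This is a sketch modeled on the pbc example in the survival vignette cited above. It is not a reconstruction of the asker's truncated code, and the choice of bili and protime as covariates is an assumption.

```r
library(survival)
data(pbc, package = "survival")  # loads both pbc and pbcseq

# Baseline covariates, one row per subject
temp <- subset(pbc, id <= 312, select = c(id:sex, stage))
# Set each subject's follow-up range and the event indicator
pbc2 <- tmerge(temp, temp, id = id, death = event(time, status))
# Add repeated lab measurements as time-dependent covariates
pbc2 <- tmerge(pbc2, pbcseq, id = id,
               bili = tdc(day, bili), protime = tdc(day, protime))

# Counting-process form: one row per (tstart, tstop] interval
fit <- coxph(Surv(tstart, tstop, death == 2) ~ log(bili) + log(protime),
             data = pbc2)
print(coef(fit))
```

The key difference from a standard Cox fit is the three-argument `Surv(tstart, tstop, event)` response, which requires the data in start-stop format with several rows per subject.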