mlr

R-MLR: get tuned hyperparameters for a wrapped learner

走远了吗. posted on 2019-12-24 06:17:13
Question: I'm building an xgboost classification task in R using the mlr package:

    # define task
    Task <- mlr::makeClassifTask(id = "classif.xgboost", data = df, target = "response", weights = NULL, positive = "yes", check.data = TRUE, blocking = folds)

    # make a base learner
    lrnBase <- makeLearner(cl = "classif.xgboost",
                           predict.type = "prob",  # "response" (= labels) or "prob" (= labels and probabilities)
                           predict.threshold = NULL)

I have to undersample one of my classes:

    lrnUnder <-

Convert predicted probabilities after downsampling to actual probabilities in classification (using mlr)

ε祈祈猫儿з posted on 2019-12-12 08:54:27
Question: If I use undersampling to train a model on an unbalanced binary target variable, the prediction method calculates probabilities under the assumption of a balanced data set. How can I convert these probabilities to actual probabilities for the unbalanced data? Is there a conversion argument/function implemented in the mlr package or another package? For example:

    a <- data.frame(y=factor(sample(0:1, prob = c(0.1,0.9), replace=T, size=100)))
    a$x <- as.numeric(a$y)+rnorm(n=100, sd=1)
    task <
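The conversion asked about here is often handled with a prior-shift correction of the predicted odds. The following is a minimal base R sketch of that idea, not an mlr function: `correct_prior` and its arguments are hypothetical names, and the prior values are illustrative. It assumes that undersampling was random within each class and that the model's probabilities are reasonably calibrated on the resampled data.

```r
# Hypothetical helper (not part of mlr): rescale probabilities predicted by a
# model trained on resampled data back to the original class prior.
#   p_s     : predicted probability of the positive class (resampled scale)
#   prior   : share of the positive class in the original data
#   prior_s : share of the positive class in the resampled training data
correct_prior <- function(p_s, prior, prior_s) {
  stopifnot(all(p_s >= 0 & p_s <= 1),
            prior > 0, prior < 1, prior_s > 0, prior_s < 1)
  # ratio of the original prior odds to the resampled prior odds
  w <- (prior / (1 - prior)) / (prior_s / (1 - prior_s))
  # multiply the predicted odds by w, then convert back to a probability
  w * p_s / (w * p_s + (1 - p_s))
}

# Illustrative values: positive class is 10% of the original data,
# 50% after undersampling the other class.
p_balanced <- c(0, 0.2, 0.5, 0.9, 1)
p_actual <- correct_prior(p_balanced, prior = 0.1, prior_s = 0.5)
print(round(p_actual, 3))
```

A prediction of 0.5 on the balanced scale maps back to the original prior, and the ordering of predictions is preserved, so ranking metrics such as AUC are unaffected by the correction.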

Get predictions on test sets in MLR

天大地大妈咪最大 posted on 2019-12-12 01:24:40
Question: I'm fitting classification models for binary problems using the mlr package in R. For each model, I perform a cross-validation with embedded feature selection using the "selectFeatures" function and retrieve the mean AUC over the test sets. Next, I would like to retrieve the predictions on the test sets for each fold, but this function does not seem to support that. I already tried plugging the selected predictors into the "resample" function to get them. It works, but the performance metrics are different, which is not

Why does mlr give different results in different runs even when using set.seed()?

血红的双手。 posted on 2019-12-12 00:14:12
Question: To publish reproducible results obtained with the mlr package, one should use the set.seed() function to control the randomness of the code. In testing, however, this practice does not seem to give the desired result: different runs of the code give slightly different outputs, as reported in the source of this question and shown in the following code. Here's some reproducible code:

    ## libraries
    library(mlr)
    library(parallel)
    library(parallelMap)

    ## options
    set.seed(1)
    cv.n <- 3
    bag.n <- 3

    ## data
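The excerpt is cut off before the part of the code that produces the differing outputs, so the actual cause is not visible here. One common source of this kind of behaviour, assuming the work is sent to parallel workers (the snippet loads parallel and parallelMap), is that a plain set.seed() in the main session does not seed the workers. The sketch below uses only base R's parallel package; `draw_on_workers` is a hypothetical helper, and this is not a statement about how mlr or parallelMap handle seeds internally.

```r
library(parallel)

# Hypothetical helper: draw one random number per task on a fresh
# 2-worker cluster. clusterSetRNGStream() gives each worker its own
# L'Ecuyer-CMRG stream derived from `seed`, so the draws are controlled
# by `seed` rather than by set.seed() in the main session.
draw_on_workers <- function(seed, n = 4) {
  cl <- makeCluster(2)
  on.exit(stopCluster(cl))
  clusterSetRNGStream(cl, iseed = seed)
  unlist(parLapply(cl, seq_len(n), function(i) rnorm(1)))
}

run1 <- draw_on_workers(seed = 1)
run2 <- draw_on_workers(seed = 1)
print(identical(run1, run2))
```

The design point is that each worker process carries its own random number state, so reproducibility has to be set up per worker rather than once in the calling session.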

Multi-label text classification using mlr package in R

自古美人都是妖i posted on 2019-12-11 12:46:55
Question: I need to train a model that performs multilabel multiclass classification on text data. I am currently trying to do this with the mlr package in R, following the directions in this link: Multilabel Classification (using mlr R package)

1) Is there any other package recommended?
2) Otherwise, I am stuck at this place (as instructed in the article mentioned above):

    classify <- getTaskData(dtmDf)  ## dtmDf is my dtm converted to dataframe form

'classify' is NULL. Any help/directions would

Change color and legend of plotLearnerPrediction ggplot2 object

不想你离开。 posted on 2019-12-11 09:05:43
Question: I've been producing a number of nice plots with the plotLearnerPrediction function in the mlr package for R. They look like this. From looking into the source code of the plotLearnerPrediction function, it looks like the color surfaces are made with geom_tile. A plot can, for example, be made by:

    library(mlr)
    data(iris)
    # make a learner
    lrn <- "classif.qda"
    # make a task
    my.task <- makeClassifTask(data = iris, target = "Species")
    # make plot
    plotLearnerPrediction(learner = lrn, task = my.task)

Now

How can a blocking factor be included in makeClassifTask() from mlr package?

≯℡__Kan透↙ posted on 2019-12-06 08:20:32
Question: In some classification tasks using the mlr package, I need to deal with a data.frame similar to this one:

    set.seed(pi)
    # Dummy data frame
    df <- data.frame(
      # Repeated values ID
      ID = sort(sample(c(0:20), 100, replace = TRUE)),
      # Some variables
      X1 = runif(10, 1, 10),
      # Some Label
      Label = sample(c(0,1), 100, replace = TRUE)
    )
    df

I need to cross-validate the model while keeping together the values with the same ID. I know from the tutorial that: https://mlr-org.github.io/mlr-tutorial/release/html/task
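What "keeping together the values with the same ID" means can be sketched in base R, independent of mlr: whole IDs, not single rows, are assigned to folds. mlr's makeClassifTask() accepts a `blocking` factor for this purpose; the code below does not call mlr and only illustrates the grouping idea. The fold count `k`, the seed, and the column name `fold` are assumptions made for the example.

```r
# Toy data in the spirit of the question: repeated IDs across rows.
set.seed(1)
df <- data.frame(
  ID = sort(sample(0:20, 100, replace = TRUE)),
  X1 = runif(100, 1, 10),
  Label = factor(sample(c(0, 1), 100, replace = TRUE))
)

# Assign whole IDs to folds: deal fold numbers 1..k out over the unique
# IDs in random order, then look up each row's fold by its ID.
k <- 5
ids <- unique(df$ID)
fold_of_id <- setNames(rep_len(seq_len(k), length(ids))[sample(length(ids))], ids)
df$fold <- unname(fold_of_id[as.character(df$ID)])

# Check: every ID should appear in exactly one fold.
folds_per_id <- tapply(df$fold, df$ID, function(f) length(unique(f)))
print(all(folds_per_id == 1))
```

Because folds are built from IDs, fold sizes in rows can be uneven when some IDs have many more rows than others.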