How to use custom cross validation folds with XGBoost

栀梦 2020-12-18 13:37

I'm using the R wrapper for XGBoost. In the function xgb.cv, there is a folds parameter with the description:

list provides a po

3 Answers
  • 2020-12-18 13:40

    Through some trial and error, I figured out that xgboost treats the passed indices as the indices of the test (held-out) folds. I confirmed this in the documentation for the current devel version of xgboost, which states it explicitly.
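
    To make the "test fold" reading concrete, here is a minimal base R sketch. The row count and the fold contents are made up for illustration, and the code only mimics how the held-out and training rows relate; it does not call xgboost.

    ```r
    # Hypothetical toy setup: 10 rows split into 3 custom folds.
    n_rows <- 10
    custom_folds <- list(c(1, 2, 3), c(4, 5, 6), c(7, 8, 9, 10))

    # Each list element holds the rows that are held out for evaluation in that fold.
    # The rows used for fitting in that fold are all the remaining rows.
    train_rows <- lapply(custom_folds, function(test_idx) {
      setdiff(seq_len(n_rows), test_idx)
    })

    train_rows[[1]]  # rows 4 to 10: everything except the first fold's test rows
    ```

    So a fold list that contains the training rows instead would silently evaluate on the wrong part of the data.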

  • 2020-12-18 13:43

    Here is an example for both generating the folds and using them.

    Assume in our dataframe we have a column of ids, such that we want to put all rows with a given id value in a fold.

    The code below

    • finds the unique ids
    • preallocates a list for the folds
    • iterates over ids, creating lists of row indices that match

      fold.ids <- unique(df$id)
      custom.folds <- vector("list", length(fold.ids))
      i <- 1
      for (id in fold.ids) {
        custom.folds[[i]] <- which(df$id %in% id)
        i <- i + 1
      }
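
    As a possible shorter alternative to the loop, base R's split() can group the row numbers by id in one call. This is only a sketch: the data frame below is a hypothetical stand-in for df, and the folds come out ordered by sorted id rather than by order of appearance, which should not matter for cross validation.

    ```r
    # Hypothetical data frame with an `id` column (values are illustrative only).
    df <- data.frame(id = c("a", "b", "a", "c", "b", "c"), x = 1:6)

    # split() groups the row numbers by id; unname() drops the id names so the
    # result is a plain list of integer index vectors, one per id.
    custom.folds <- unname(split(seq_len(nrow(df)), df$id))

    custom.folds
    ```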

    Here is an example using the above fold list in xgb.cv:

    res <- xgb.cv(param, dtrain, nround, folds=custom.folds, prediction = TRUE)

    Reasonable values for the other xgb.cv parameters can be found in the documentation.
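
    Before passing a hand-built fold list to xgb.cv, it may be worth checking that no row appears in two folds and that every row is covered, at least when an out-of-fold prediction is wanted for each row. The helper below is hypothetical (check_folds is not part of xgboost) and uses base R only.

    ```r
    # Hypothetical helper, not an xgboost function: returns TRUE when the folds
    # are disjoint and together contain each of the rows 1..n_rows exactly once.
    check_folds <- function(folds, n_rows) {
      all_idx <- unlist(folds)
      anyDuplicated(all_idx) == 0 && setequal(all_idx, seq_len(n_rows))
    }

    ok_folds  <- check_folds(list(c(1, 3), c(2, 5), c(4, 6)), 6)  # TRUE
    bad_folds <- check_folds(list(c(1, 3), c(3, 5)), 6)           # FALSE: row 3 repeats
    ```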

  • 2020-12-18 13:54

    This worked best for me:

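    # createFolds returns held-out (test) row indices by default (returnTrain = FALSE)
    custom.folds <- caret::createFolds(data$Label, k = 10, list = TRUE)

    xgbcv <- xgb.cv(
      params = params,
      data = df,
      nrounds = 100,  # assumed value; xgb.cv requires nrounds and the original call omitted it
      maximize = FALSE,
      prediction = TRUE,
      metrics = "logloss",
      folds = custom.folds
    )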
    