Parallel processing with xgboost and caret


Question:

I want to parallelize the model fitting process for xgboost while using caret. From what I have seen in xgboost's documentation, the nthread parameter controls the number of threads to use while fitting the models, in the sense of building the trees in parallel. Caret's train function performs parallelization in the sense of, for example, running a separate process for each iteration of a k-fold cross-validation. Is this understanding correct? If so, is it better to:

  1. Register the number of cores (for example, with the doMC package and the registerDoMC function), set nthread=1 via caret's train function so it passes that parameter to xgboost, set allowParallel=TRUE in trainControl, and let caret handle the parallelization for the cross-validation; or
  2. Disable caret parallelization (allowParallel=FALSE and no parallel back-end registration) and set nthread to the number of physical cores, so the parallelization is contained exclusively within xgboost.

Or is there no "better" way to perform the parallelization?
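For concreteness, a minimal sketch of the two setups might look like the following. The data frame dat (with a Class outcome, e.g. from caret::twoClassSim), the core count of 4, the 5-fold CV, and the object names are placeholders; allowParallel and nthread are the actual caret/xgboost parameters referred to above.

library(caret)
library(xgboost)
library(doMC)

## Placeholder data: a two-class data frame with outcome column "Class"
dat <- twoClassSim(1000)

## Option 1: caret parallelizes the resampling loop; xgboost stays single-threaded
registerDoMC(cores = 4)                      # 4 cores is a placeholder
ctrl_par <- trainControl(method = "cv", number = 5, allowParallel = TRUE)
fit_caret_par <- train(Class ~ ., data = dat, method = "xgbTree",
                       trControl = ctrl_par, nthread = 1)

## Option 2: resampling loop stays sequential; xgboost multithreads each model fit
ctrl_seq <- trainControl(method = "cv", number = 5, allowParallel = FALSE)
fit_xgb_par <- train(Class ~ ., data = dat, method = "xgbTree",
                     trControl = ctrl_seq, nthread = 4)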

Edit: I ran the code suggested by @topepo, with tuneLength = 10 and search = "random", and specifying nthread = 1 on the last line (otherwise I understand that xgboost will use multithreading). These are the results I got:

xgb_par[3]              elapsed  283.691
just_seq[3]             elapsed  276.704
mc_par[3]               elapsed   89.074
just_seq[3]/mc_par[3]   elapsed    3.106451
just_seq[3]/xgb_par[3]  elapsed    0.9753711
xgb_par[3]/mc_par[3]    elapsed    3.184891

At the end, it turned out that both for my data and for this test case, letting caret handle the parallelization was a better choice in terms of runtime.

Answer 1:

It is not simple to project what the best strategy would be. My (biased) thought is that you should parallelize the process that takes the longest. Here, that would be the resampling loop since an open thread/worker would invoke the model many times. The opposite approach of parallelizing the model fit will start and stop workers repeatedly and theoretically slows things down. Your mileage may vary.

I don't have OpenMP installed but there is code below to test (if you could report your results, that would be helpful).

library(caret)
library(plyr)
library(xgboost)
library(doMC)

## Fit a randomly searched xgboost model; extra arguments (e.g. nthread)
## are passed through to xgboost
foo <- function(...) {
  set.seed(2)
  mod <- train(Class ~ ., data = dat,
               method = "xgbTree", tuneLength = 50,
               ..., trControl = trainControl(search = "random"))
  invisible(mod)
}

set.seed(1)
dat <- twoClassSim(1000)

## Sequential baseline
just_seq <- system.time(foo())

## Parallelize within xgboost only (I don't have OpenMP installed)
xgb_par <- system.time(foo(nthread = 5))

## Parallelize the resampling loop via caret + doMC
registerDoMC(cores = 5)
mc_par <- system.time(foo())

My results (without OpenMP)

> just_seq[3]              elapsed  326.422
> xgb_par[3]               elapsed  319.862
> mc_par[3]                elapsed  102.329
>
> ## Speedups
> xgb_par[3]/mc_par[3]     elapsed    3.12582
> just_seq[3]/mc_par[3]    elapsed    3.189927
> just_seq[3]/xgb_par[3]   elapsed    1.020509

