Export variable in foreach

被刻印的时光 ゝ submitted on 2021-01-29 03:53:51

Question


I am having trouble exporting a data frame to %dopar% in the foreach package. It works if I use %do% together with registerDoSEQ(), but with registerDoParallel() I always get:

Error in { : task 1 failed - "object 'kyphosis' not found"

Here is a reproducible example using the kyphosis data from the rpart package. I am trying to parallelize stepwise regression a little:

library(doParallel)
library(foreach)
library(rpart)

invars <- c('Age', 'Number', 'Start')
n_vars <- 2
vars <- length(invars)
iter <- trunc(vars/n_vars)
threads <- 4
if (vars%%n_vars == 0) iter <- iter - 1
iter <- 0:iter

cl <- makeCluster(threads)
registerDoParallel(cl)
#registerDoSEQ()

terms <- ''
min_formula <- paste0('Kyphosis~ 1', terms)
fit <- glm(formula = as.formula(min_formula), data = kyphosis, family = 'binomial')

out <- foreach(x = iter, .export = 'kyphosis') %dopar%  {

  nv <- invars[(x * n_vars + 1):(min(x * n_vars + n_vars, vars))]
  sfit <- step(object = fit, trace = FALSE, scope = list(
    lower = min_formula,
    upper = as.formula(paste(min_formula, '+', paste0(nv, collapse = '+')))),
    steps = 1, direction = 'forward')
  aic <- sfit$aic

  names(aic) <- if(nrow(sfit$anova) == 2) sfit$anova$Step[2]
  aic
}
out
stopCluster(cl)

Answer 1:


Add this to the body of the foreach loop, before the call to step:

.GlobalEnv$kyphosis <- kyphosis
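
For context, here is a minimal sketch of how the loop from the question might look with that line in place. It assumes two worker processes instead of four and leaves the rest of the question's setup as it is; it is meant as an illustration of the workaround, not a definitive fix.

```r
library(doParallel)
library(foreach)
library(rpart)  # provides the kyphosis data frame

invars <- c('Age', 'Number', 'Start')
n_vars <- 2
vars <- length(invars)
iter <- trunc(vars / n_vars)
if (vars %% n_vars == 0) iter <- iter - 1
iter <- 0:iter

cl <- makeCluster(2)  # assumption: 2 workers are enough for the sketch
registerDoParallel(cl)

min_formula <- 'Kyphosis~ 1'
fit <- glm(formula = as.formula(min_formula), data = kyphosis, family = 'binomial')

out <- foreach(x = iter, .export = 'kyphosis') %dopar% {
  # Workaround from the answer: copy the exported data frame into the
  # worker's global environment so the refit inside step() can find it
  .GlobalEnv$kyphosis <- kyphosis

  nv <- invars[(x * n_vars + 1):(min(x * n_vars + n_vars, vars))]
  sfit <- step(object = fit, trace = FALSE, scope = list(
    lower = min_formula,
    upper = as.formula(paste(min_formula, '+', paste0(nv, collapse = '+')))),
    steps = 1, direction = 'forward')
  aic <- sfit$aic

  names(aic) <- if (nrow(sfit$anova) == 2) sfit$anova$Step[2]
  aic
}
stopCluster(cl)
out
```

Each element of out is the AIC of the model reached after one forward step for that chunk of candidate variables.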

I'm not sure why this happens, but my intuition is that step calls glm internally, using the information stored in fit$call, which is

glm(formula = as.formula(min_formula), family = "binomial", data = kyphosis)

with the updated formula, while the data = kyphosis part stays the same. So glm tries to look up kyphosis in the global environment, which on a worker does not contain the exported data frame.
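
The point about fit$call can be checked without any cluster. The short sketch below only inspects the fitted object; it shows that the data argument is stored as a bare name rather than as a copy of the data frame, which fits the explanation above, although it does not prove how step uses that name.

```r
library(rpart)  # provides the kyphosis data frame

fit <- glm(Kyphosis ~ 1, data = kyphosis, family = 'binomial')

# The stored call keeps the data argument unevaluated
data_arg <- fit$call$data
print(fit$call)
print(is.name(data_arg))  # TRUE: only the symbol is kept, not the data
```

Anything that re-evaluates this call later has to resolve the name kyphosis again in whatever environment it is evaluated in.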



Source: https://stackoverflow.com/questions/33022388/export-variable-in-foreach
