doRedis/foreach GBM parallel processing error in R

Submitted by 人走茶凉 on 2021-01-28 15:09:49

Question


I am fitting a gbm model with the caret package and trying to run it in parallel with the doRedis package. I can get the backend workers up and running, but the run fails when the results are recombined into the final model. I get this error:

    Error in foreach(j = 1:12, .combine = sum, .multicombine = TRUE) %dopar%  : 
      target of assignment expands to non-language object

This is my first time using a foreach loop (let alone on a problem as complex as gbm), and I am having trouble understanding it and getting it implemented. I have done many Google searches and found nothing on using foreach with gbm. Any help with understanding foreach would be greatly appreciated. Here is my code:

    set.seed(825)
    library(caret)
    require(foreign)

    data <- read.spss("C:\\Users\\cc\\Documents\\mydata.sav",use.value.labels=TRUE, to.data.frame = TRUE)
    getOption("max.print")
    options(max.print = 99999999)
    set.seed(825)
    start.time <- Sys.time()
    x <- data[, -162]
    y <- data[, 162]
    fitControl <- trainControl(method = "cv", number = 8, allowParallel = TRUE)
    gbmGrid <- expand.grid(interaction.depth = c(49), n.trees = (1:2), shrinkage = c(0.03), n.minobsinnode = 50)


    require(doRedis)
    registerDoRedis('jobs')
    options('redis:num'=TRUE)
    foreach(j=1:12,.combine=sum,.multicombine=TRUE) %dopar%

    gbmFit <- train(x=x,y=y,"gbm", tuneGrid = gbmGrid, trControl=fitControl)
    gbmFit
    summary(gbmFit)

    end.time <- Sys.time()
    time.taken <- end.time - start.time
    time.taken

UPDATE: Following a suggestion to reproduce the problem with a public dataset, I replaced mydata with the iris dataset (data <- iris) and changed x and y to x <- data[, -5] and y <- data[, 5]. The same error occurred.


Answer 1:


I found an answer to this! I got in touch with the creator of doRedis, who in turn got in touch with the creator of caret. It turns out that caret splits the jobs up automatically, so the foreach loop is not necessary. Remove that line completely and the code works.
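For anyone wondering why the stray line produces this particular error, here is a small sketch using only base R's quote(), so nothing is actually evaluated and neither foreach nor caret needs to be loaded. The foreach(...) %dopar% line has no body of its own, so the parser keeps reading and picks up the next line as the body. Because %dopar% binds more tightly than <-, the whole foreach(...) %dopar% gbmFit expression ends up as the target of the assignment, which R cannot assign into. The train() arguments are shortened here for illustration.

```r
# Capture the two lines from the question without evaluating them.
expr <- quote(
  foreach(j = 1:12, .combine = sum, .multicombine = TRUE) %dopar%

  gbmFit <- train(x = x, y = y, "gbm")
)

# The top-level call is the assignment, not %dopar%.
assign_op <- as.character(expr[[1]])

# The left-hand side of the assignment is the %dopar% call ...
target_op <- as.character(expr[[2]][[1]])

# ... whose "body" is just the bare symbol gbmFit.
body_sym <- as.character(expr[[2]][[3]])

print(c(assign_op, target_op, body_sym))
```

So the loop was never wrapping train() at all; R was trying to assign the result of train() into the foreach expression.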

On a side note, he asked me to point others to GitHub for the newest doRedis package, as it is better than the current doRedis release but not quite ready for CRAN.

Use this code to install the new doRedis package (make sure you also have Rtools installed before you run it):

    install.packages("devtools")
    devtools::install_github("bwlewis/doRedis")
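Putting the answer together, a minimal sketch of the corrected script might look like the following. It uses the iris data from the question's update and assumes a Redis server is running with workers already listening on the 'jobs' queue, as in the question. The tuning grid values are illustrative choices sized for iris, not the ones from the original post, and verbose = FALSE is an extra argument passed through to gbm to keep the output short.

```r
library(caret)
library(doRedis)

# Assumption: a Redis server is reachable and workers are already
# listening on the 'jobs' queue, as described in the question.
registerDoRedis("jobs")

set.seed(825)
x <- iris[, -5]
y <- iris[, 5]

fitControl <- trainControl(method = "cv", number = 8, allowParallel = TRUE)

# Illustrative grid, kept small for the iris data (not the original values).
gbmGrid <- expand.grid(interaction.depth = 2,
                       n.trees = c(50, 100),
                       shrinkage = 0.03,
                       n.minobsinnode = 5)

# No foreach() wrapper: train() hands the resampling work to the
# registered backend on its own when allowParallel = TRUE.
gbmFit <- train(x = x, y = y, method = "gbm",
                tuneGrid = gbmGrid, trControl = fitControl,
                verbose = FALSE)
gbmFit

# Clean up the work queue when finished.
removeQueue("jobs")
```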


Source: https://stackoverflow.com/questions/32971288/doredis-foreach-gbm-parallel-processing-error-in-r
