How to set up workers for parallel processing in R using snowfall and multiple Windows nodes?

Submitted by 北战南征 on 2019-12-20 09:36:51

Question


I’ve successfully used snowfall to set up a cluster on a single server with 16 processors.

library(snowfall)

# Stop any cluster that is still running from an earlier session
if (sfIsRunning()) sfStop()

# Start 15 local workers on the 16-processor server
number.of.cpus <- 15
sfInit(parallel = TRUE, cpus = number.of.cpus)
stopifnot(sfCpus() == number.of.cpus)
stopifnot(sfParallel())

# Print the hostname for each cluster member
sayhello <- function()
{
    info <- Sys.info()[c("nodename", "machine")]
    paste("Hello from", info[1], "with CPU type", info[2])
}
names <- sfClusterCall(sayhello)
print(unlist(names))

Now I am looking for complete instructions on how to move to a distributed model. I have 4 different Windows machines with a total of 16 cores that I would like to use as a 16-node cluster. So far, I understand that I could either set up a SOCK connection manually or leverage MPI. While both appear possible, I haven’t found clear and complete directions for either.
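
A minimal sketch of what the master-side call might look like for the SOCK route, assuming snowfall's `socketHosts` argument to `sfInit`. The four addresses are hypothetical placeholders (only 172.01.01.03 appears in this question), and the even split of 4 cores per machine is an assumption. The `sfInit` call is left commented out because it can only succeed once R workers can actually be started on each host, which is the open problem here.

```r
# Hypothetical worker addresses; only 172.01.01.03 appears in the question.
worker.hosts <- c("172.01.01.01", "172.01.01.02",
                  "172.01.01.03", "172.01.01.04")

# Assumption: each machine contributes 4 cores (4 x 4 = 16 workers).
cores.per.host <- 4

# One entry per worker process: each host is repeated once per core.
socket.hosts <- rep(worker.hosts, each = cores.per.host)
stopifnot(length(socket.hosts) == 16)
print(table(socket.hosts))

# Not run: requires that R workers can be started on every host.
# library(snowfall)
# sfInit(parallel = TRUE, cpus = length(socket.hosts),
#        type = "SOCK", socketHosts = socket.hosts)
```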

The SOCK route appears to depend on code in a snowlib script. I can generate a stub from the master side with the following code:

winOptions <-
    list(host="172.01.01.03",
         rscript="C:/Program Files/R/R-2.7.1/bin/Rscript.exe",
         snowlib="C:/Rlibs")

cl <- makeCluster(rep(list(winOptions), 2), type = "SOCK", manual = TRUE)

It yields the following:

Manually start worker on 172.01.01.03 with
     "C:/Program Files/R/R-2.7.1/bin/Rscript.exe"
      C:/Rlibs/snow/RSOCKnode.R
      MASTER=Worker02 PORT=11204 OUT=/dev/null SNOWLIB=C:/Rlibs
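
The stub above is a single command line meant to be run on the worker machine. As a rough sketch, it can be reassembled in R as shown below, using only the values printed above. Quoting the executable path (because it contains a space) is the only addition, and the `system2` call is left commented out since it is meant to be run on the worker, not on the master.

```r
# Values copied from the message printed by makeCluster(..., manual = TRUE).
rscript <- "C:/Program Files/R/R-2.7.1/bin/Rscript.exe"
worker.args <- c("C:/Rlibs/snow/RSOCKnode.R",
                 "MASTER=Worker02", "PORT=11204",
                 "OUT=/dev/null", "SNOWLIB=C:/Rlibs")

# Quote only the executable path, because it contains a space.
worker.command <- paste(paste0("\"", rscript, "\""),
                        paste(worker.args, collapse = " "))
cat(worker.command, "\n")

# Not run: on the worker machine, this would launch the worker process.
# system2(rscript, worker.args, wait = FALSE)
```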

It feels like a reasonable start. I found code for RSOCKnode.R on GitHub under the snow package:

local({
    master <- "localhost"
    port <- ""
    snowlib <- Sys.getenv("R_SNOW_LIB")
    outfile <- Sys.getenv("R_SNOW_OUTFILE") ##**** defaults to ""; document

    args <- commandArgs()
    pos <- match("--args", args)
    args <- args[-(1 : pos)]
    for (a in args) {
        pos <- regexpr("=", a)
        name <- substr(a, 1, pos - 1)
        value <- substr(a,pos + 1, nchar(a))
        switch(name,
               MASTER = master <- value,
               PORT = port <- value,
               SNOWLIB = snowlib <- value,
               OUT = outfile <- value)
    }

    if (! (snowlib %in% .libPaths()))
        .libPaths(c(snowlib, .libPaths()))
    library(methods) ## because Rscript as of R 2.7.0 doesn't load methods
    library(snow)

    if (port == "") port <- getClusterOption("port")

    sinkWorkerOutput(outfile)
    cat("starting worker for", paste(master, port, sep = ":"), "\n")
    slaveLoop(makeSOCKmaster(master, port))
})
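
To make the argument handling in this script easier to follow, here is a small stand-alone sketch of the same NAME=value parsing. `parse.worker.args` is a hypothetical helper, not part of snow. It shows that MASTER and PORT from the command line end up as the host and port passed to `makeSOCKmaster`, so the worker appears to be handed the master's address rather than a port of its own to listen on.

```r
# Hypothetical helper mirroring the loop in RSOCKnode.R: splits each
# NAME=value argument at the first "=" and returns a named character vector.
parse.worker.args <- function(args) {
    pos <- regexpr("=", args, fixed = TRUE)
    values <- substr(args, pos + 1, nchar(args))
    names(values) <- substr(args, 1, pos - 1)
    values
}

# Arguments taken from the manual-start message shown earlier.
parsed <- parse.worker.args(c("MASTER=Worker02", "PORT=11204",
                              "OUT=/dev/null", "SNOWLIB=C:/Rlibs"))
print(parsed)  # all values, including the port, remain character strings
```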

It’s not clear how to actually start a SOCK listener on the workers, unless it is buried in snow::recvData.

Looking into the MPI route, as far as I can tell, Microsoft MPI version 7 is a starting point. However, I could not find a Windows alternative to sfCluster. I was able to start the MPI service, but it does not appear to listen on port 22, and none of my attempts to reach it with makeCluster have yielded a result. I’ve disabled the firewall and tested both with makeCluster and by connecting directly from the master to the worker with PuTTY.


Is there a comprehensive, step-by-step guide to setting up a snowfall cluster on Windows workers that I’ve missed? I am fond of snowfall::sfClusterApplyLB and would like to continue using it, but if there is an easier solution, I’d be willing to change course. Looking into Rmpi and parallel, I found alternative solutions for the master side of the work, but still little to no specific detail on how to set up workers running Windows.

Due to the nature of the work environment, neither moving to AWS nor moving to Linux is an option.

Related questions without definitive answers for Windows worker nodes:

  • How to set up cluster slave nodes (on Windows)
  • Parallel R on a Windows cluster
  • Create a cluster of co-workers' Windows 7 PCs for parallel processing in R?

Source: https://stackoverflow.com/questions/36297815/how-to-setup-workers-for-parallel-processing-in-r-using-snowfall-and-multiple-wi
