System time for parallel and serial processing

◇◆丶佛笑我妖孽 submitted on 2019-12-07 04:38:32

Question


I'm running a Bayesian MCMC probit model, and I'm trying to implement it in parallel. I'm getting confusing results about the performance of my machine when comparing parallel to serial. I don't have a lot of experience doing parallel processing, so it is possible I'm not doing it right.

I'm using MCMCprobit in the MCMCpack package for the probit model, and for parallel processing I'm using parLapply in the parallel package.

Here's my code for the serial run, and the results from system.time:

system.time(serial<-MCMCprobit(formula=econ_model,data=mydata,mcmc=10000,burnin=100))

   user  system elapsed 
 657.36   73.69  737.82

Here's my code for the parallel run:

# Setting up the functions for parLapply:
library(parallel)  # provides makeCluster, parLapply, clusterSetRNGStream, stopCluster

# Runs on each worker: fits the full model with the arguments forwarded via ...
probit_modeling <- function(...) {
  args <- list(...)
  library(MCMCpack)
  MCMCprobit(formula=args$model, data=args$data, burnin=args$burnin, mcmc=args$mcmc, thin=1)
}

# Runs on the master: starts mc workers, runs the model on each, combines the results
probit_Parallel <- function(mc, model, data, burnin, mcmc) {
  cl <- makeCluster(mc)
  ## To make this reproducible:
  clusterSetRNGStream(cl, 123)
  library(MCMCpack) # needed for c() method on master
  probit.res <- do.call(c, parLapply(cl, seq_len(mc), probit_modeling, model=model, data=data,
                                     mcmc=mcmc, burnin=burnin))
  stopCluster(cl)
  return(probit.res)
}


system.time(test<-probit_Parallel(model=econ_model,data=mydata,mcmc=10000,burnin=100,mc=2))

And the results from system.time:

   user  system elapsed 
   0.26    0.53 1097.25 

Any ideas why user and system times would be so much shorter for the parallel process, but the elapsed time so much longer? I tried it at shorter MCMC runs (100 and 1000), and the story is the same. I'm assuming I'm making a mistake somewhere.

Here are my computer specifications:

  • R 3.1.3
  • 8 GB memory
  • Windows 7 64 bit
  • Intel Core i5 2520M CPU, dual core

Answer 1:


It appears to me that each of the two workers is doing as much work as the sequential version: every worker runs the full mcmc=10000 iterations. To execute faster than the sequential code, each worker should perform only a fraction of the total work. In this example that might be accomplished by dividing mcmc by the number of workers, although that may not be what you really want to do.
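A minimal sketch of that idea, assuming it is acceptable to split the draws across independent chains. The names iterations_per_worker, probit_chunk and probit_parallel_split are hypothetical and not part of the question's code. The seed = list(NA, i) argument is an assumption based on the MCMCpack help page, which describes a two-element list as selecting a L'Ecuyer substream; check ?MCMCprobit for your installed version. Only mcmc is divided here, because each chain still needs its own burn-in.

```r
library(parallel)

# Hypothetical helper: iterations each worker should run so that the
# workers together produce about `total` draws.
iterations_per_worker <- function(total, workers) {
  ceiling(total / workers)
}

# Worker function: i is the worker index supplied by parLapply.
probit_chunk <- function(i, model, data, burnin, mcmc) {
  library(MCMCpack)
  # Assumption: seed = list(NA, i) picks L'Ecuyer substream i so that the
  # chains differ between workers (see ?MCMCprobit).
  MCMCprobit(formula = model, data = data, burnin = burnin,
             mcmc = mcmc, thin = 1, seed = list(NA, i))
}

# Master function: returns a list with one chain per worker.
probit_parallel_split <- function(mc, model, data, burnin, mcmc) {
  cl <- makeCluster(mc)
  on.exit(stopCluster(cl))  # stop the workers even if a fit fails
  parLapply(cl, seq_len(mc), probit_chunk,
            model = model, data = data, burnin = burnin,
            mcmc = iterations_per_worker(mcmc, mc))
}

# Per-worker iteration count for the run in the question
iterations_per_worker(10000, 2)
```

With the objects from the question, the call would be probit_parallel_split(mc=2, model=econ_model, data=mydata, burnin=100, mcmc=10000). The sketch returns the chains as a plain list and leaves open how to combine them; coda::mcmc.list is one option if the chains should be kept separate for diagnostics.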

I think this duplicated work explains the long elapsed time reported by system.time. The "user" and "system" times are short because they are the times for the master process, which uses very little CPU time while executing parLapply. The real CPU time is used by the workers, and system.time does not report it.
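One way to see the workers' CPU time, sketched here with a CPU-bound stand-in instead of MCMCprobit, is to call system.time inside the worker function and return the result. The function timed_work and the loop size are hypothetical and chosen only so the example finishes quickly.

```r
library(parallel)

# Hypothetical CPU-bound stand-in for an MCMC run. Each worker times its
# own computation and returns the proc_time object to the master.
timed_work <- function(i, n) {
  system.time({
    x <- 0
    for (k in seq_len(n)) x <- x + sqrt(k)
  })
}

cl <- makeCluster(2)
# master_time is measured in the master process only
master_time <- system.time(
  worker_times <- parLapply(cl, 1:2, timed_work, n = 5e6)
)
stopCluster(cl)

print(master_time)   # master: user/system should be small next to elapsed
print(worker_times)  # one timing per worker, measured on the worker itself
```

Comparing the two printouts should show the pattern from the question: the CPU time appears in the per-worker timings, while the master mostly waits.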



Source: https://stackoverflow.com/questions/30669328/system-time-for-parallel-and-serial-processing
