Question:
Being able to multithread on Windows would be awesome, but perhaps this problem is harder than I had thought. :(

Inside survey:::svyby.default there is a block that runs either lapply or mclapply, depending on multicore=TRUE and your operating system. Windows users get forced into the lapply loop no matter what, and I was wondering if there's any way to go down the mclapply path instead, to speed up the computation.
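This is not the survey source itself, just a hypothetical sketch (the function and argument names are made up) of the kind of branch being described: the forked path is only attempted off Windows.

run_groups <- function(groups, FUN, multicore = FALSE) {
  if (multicore && .Platform$OS.type != "windows") {
    parallel::mclapply(groups, FUN)   # fork-based; not available on Windows
  } else {
    lapply(groups, FUN)               # the serial path Windows users are forced down
  }
}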
I don't know too much about the innards of parallel processing, but I did some experiments to see whether any of the Windows-compatible alternatives would work. First I tried overwriting mclapply with:
library(parallel)

mclapply <-
  function( X , FUN , ... ){
    clusterApply(
      x = X ,
      fun = FUN ,
      cl = makeCluster( detectCores() ) , ... )   # note: this cluster is never stopped
  }
Next I used fixInNamespace( svyby.default , "survey" ) to remove the line

if (multicore) parallel:::closeAll()
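For what it's worth, one way to check which lines of the installed svyby.default actually mention multicore before editing anything; the exact wording varies between survey versions, so this makes no assumption about it:

fn  <- getFromNamespace("svyby.default", "survey")
src <- deparse(body(fn))
src[grepl("multicore", src)]   # print only the lines that reference multicore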
But that only got me to the point where:
> svyby(~api99, ~stype, dclus1, svymean , multicore=TRUE )
Error in checkForRemoteErrors(val) :
3 nodes produced errors; first error: object 'svymean' not found
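That error is standard socket-cluster behaviour rather than anything specific to svyby: workers started by makeCluster() are fresh R sessions with no survey package attached and none of the master session's objects, so svymean simply does not exist on them. A minimal sketch of shipping those pieces over, rebuilding dclus1 from the survey package's standard api example (this does not by itself make svyby(..., multicore=TRUE) work on Windows):

library(parallel)
library(survey)

data(api)
dclus1 <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)

cl <- makeCluster(detectCores())
clusterEvalQ(cl, library(survey))   # attach survey on every worker
clusterExport(cl, "dclus1")         # copy the design object to every worker
unlist(clusterCall(cl, function() exists("svymean")))   # TRUE once survey is attached
stopCluster(cl)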
Answer 1:
Quoting Dr. Thomas Lumley, author of the R survey package, in response to my inquiry:
No. This approach to parallelising relies on forking, which Windows doesn't support.
It would be necessary to rewrite it to use clusterApply(), and I'm pretty sure the communications overhead would eat the speed gain. With forking, the child process gets a copy of the parent process data for free -- it's all done by the virtual<->physical memory translation hardware -- but with the cluster approach R has to send data to the child process explicitly.
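A rough illustration of the trade-off he describes, not specific to survey: a forked child sees the parent's data without any explicit copying, whereas a socket cluster (the only option on Windows) has to serialize and ship that data to every worker, and for large objects the transfer can swallow the parallel speed-up.

library(parallel)

big <- rnorm(5e6)   # a few tens of MB that every worker needs

## fork-based (Mac/Linux only): children inherit `big` with no explicit transfer
# res_fork <- mclapply(1:4, function(i) mean(big), mc.cores = 4)

## socket-based (works on Windows): `big` must be copied to every worker first
cl <- makeCluster(4)
clusterExport(cl, "big")   # this explicit copy is the communications overhead
res_sock <- parLapply(cl, 1:4, function(i) mean(big))
stopCluster(cl)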
Source: https://stackoverflow.com/questions/24737166/is-it-possible-to-get-the-r-survey-packages-svyby-function-multicore-paramet