I am trying to run code on several cores (I tried both the snow and parallel packages). I have
cl <- makeCluster(2)
y <- 1:1
The worker nodes don't know about the y in the global environment on the master, so you need to get it to them somehow: either pass it as an extra argument, or export it to each node.
library(parallel)
cl <- makeCluster(2)
y <- 1:10
# add y to function definition and parSapply call
parSapply(cl, 1:5, function(x,y) x + y, y)
# export y to the global environment of each node
# then call your original code
clusterExport(cl, "y")
parSapply(cl, 1:5, function(x) x + y)
# shut the workers down when finished
stopCluster(cl)
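If you want to check what the workers can actually see, a quick diagnostic along these lines may help. This is only a sketch: it uses clusterEvalQ from the parallel package to ask each node whether y exists, before and after the export. The envir = environment() argument is my addition, so that y is looked up wherever the snippet happens to be run; at top level that is simply the global environment.

```r
library(parallel)

cl <- makeCluster(2)
y <- 1:10

# Ask every worker whether it has an object called y (it should not yet).
before <- unlist(clusterEvalQ(cl, exists("y")))

# Copy y from the calling environment into each worker's global environment.
clusterExport(cl, "y", envir = environment())

# Ask again: now every worker should report that y exists.
after <- unlist(clusterEvalQ(cl, exists("y")))

print(before)
print(after)

stopCluster(cl)
```

Running the same check after a failed parSapply call is a cheap way to tell a missing export apart from other errors.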
It is worth mentioning that your example will work if parSapply is called from within a function. What actually matters, though, is where the function function(x) x + y is created, not where parSapply is called. For example, the following code works correctly:
library(parallel)
fun <- function(cl, y) {
parSapply(cl, 1:5, function(x) x + y)
}
cl <- makeCluster(2)
fun(cl, 1:10)
stopCluster(cl)
This works because a function created inside another function is serialized along with the local environment in which it was created, whereas a function created in the global environment is not serialized along with the global environment. This can be useful at times, but it can also lead to a variety of problems if you're not aware of it, for example large local objects being sent to every worker unintentionally.
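The effect can be seen without a cluster at all. The sketch below round-trips a function through serialize() and unserialize(), which, as far as I understand, is close to what happens when a function is shipped to a worker. The names make_adder, big, f_local and f_global are hypothetical and exist only for this illustration.

```r
# A function created inside another function keeps a reference to the
# local environment it was created in (here, the one holding y and big).
make_adder <- function(y) {
  force(y)              # evaluate y now so the closure stores its value
  big <- numeric(1e6)   # hypothetical large object the closure never uses
  function(x) x + y
}
f_local <- make_adder(1:10)

# Round trip through serialize()/unserialize() to mimic sending f_local away.
f_copy <- unserialize(serialize(f_local, NULL))
print(sort(ls(environment(f_copy))))  # the local variables travelled along
print(f_copy(1))                      # y is still found after the round trip

# The unused local object is serialized too, so the payload is large.
print(length(serialize(f_local, NULL)))

# A function defined at top level only carries a reference to the global
# environment, so y is not included and the payload stays small.
y <- 1:10
f_global <- function(x) x + y
print(length(serialize(f_global, NULL)))
```

The size difference is the kind of problem hinted at above: anything that lives in the enclosing local environment is sent along, whether the worker function uses it or not.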