How do I time out a lapply when a list item fails or takes too long?

偶尔善良 提交于 2019-11-27 02:43:43

问题


For several efforts I'm involved in at the moment, I am running large datasets with numerous parameter combinations through a series of functions. The functions have a wrapper (so I can mclapply) for ease of operation on a cluster. However, I run into two major challenges.

a) My parameter combinations are large (think 20k to 100k). Sometimes particular combinations will fail (e.g. survival is too high and mortality is too low so the model never converges as a hypothetical scenario). It's difficult for me to suss out ahead of time exactly which combinations will fail (life would be easier if I could do that). But for now I have this type of setup:

failsafe <- failwith(NULL, my_wrapper_function)
# This is what I run
# Note that input_variables contains a list of variables in each list item
results <-  mclapply(input_variables, failsafe, mc.cores = 72)
# On my local dual core mac, I can't do this so the equivalent would be:
results <-  llply(input_variables, failsafe,  .progress = 'text')

The skeleton for my wrapper function looks like this:

my_wrapper_function <- function(tlist) {
    run <- tryCatch(my_model(tlist$a, tlist$b, tlist$sA, tlist$Fec, m = NULL) , error=function(e) NULL)
...
return(run)
}

Is this the most efficient approach? If for some reason a particular combination of variables crashes the model, I need it to return a NULL and carry on with the rest. However, I still have issues that this fails less than gracefully.

b) Sometimes a certain combination of inputs does not crash the model but takes too long to converge. I set a limit on the computation time on my cluster (say 6 hours) so I don't waste my resources on something that is stuck. How can I include a timeout such that if a function call takes more than x time on a single list item, it should move on? Calculating the time spent is trivial but a function mid simulation can't be interrupted to check the time, right?

Any ideas, solutions or tricks are appreciated!


回答1:


You may well be able to manage graceful-exits-upon-timout using a combination of tryCatch() and evalWithTimeout() from the R.utils package. See also this post, which presents similar code and unpacks it in a bit more detail.

require(R.utils)

myFun <- function(x) {Sys.sleep(x); x^2}

## evalWithTimeout() times out evaluation after 3.1 seconds, and then
## tryCatch() handles the resulting error (of class "TimeoutException") with 
## grace and aplomb.
myWrapperFunction <- function(i) {
    tryCatch(expr = evalWithTimeout(myFun(i), timeout = 3.1), 
             TimeoutException = function(ex) "TimedOut")
}

sapply(1:5, myWrapperFunction)
# [1] "1"        "4"        "9"        "TimedOut" "TimedOut"


来源:https://stackoverflow.com/questions/10952724/how-do-i-time-out-a-lapply-when-a-list-item-fails-or-takes-too-long

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!