Capture Arbitrary Conditions with `withCallingHandlers`

Posted by 我们两清 on 2019-12-18 13:14:35

Question


The Problem

I'm trying to write a function that will evaluate code and store the results, including any possible conditions signaled in the code. I've got this working perfectly fine, except for the situation when my function (let's call it evalcapt) is run within an error handling expression.

The problem is that withCallingHandlers will keep looking for matching condition handlers, and if someone has defined such a handler outside of my function, my function loses control of execution. Here is a simplified example of the problem:

evalcapt <- function(expr) {
  conds <- list()
  withCallingHandlers(
    val <- eval(expr),
    condition=function(e) {
      message("Caught condition of class ", deparse(class(e)))
      conds <<- c(conds, list(e))
  } )
  list(val=val, conditions=conds)
}

myCondition <- simpleCondition("this is a custom condition")
class(myCondition) <- c("custom", class(myCondition))
expr <- expression(signalCondition(myCondition), 25)

tryCatch(evalcapt(expr))          

Works as expected

Caught condition of class c("custom", "simpleCondition", "condition")
$val
[1] 25

$conditions
$conditions[[1]]
<custom: this is a custom condition>

but:

tryCatch(
  evalcapt(expr),               
  custom=function(e) stop("Hijacked `evalcapt`!")  
)

Doesn't work:

Caught condition of class c("custom", "simpleCondition", "condition")
Error in value[[3L]](cond) : Hijacked `evalcapt`!

A Solution I don't Know How To Implement

What I really need is a way to establish a restart right after the condition is signaled in the code. Frankly, that is how withCallingHandlers appears to work normally (when my handler is the last available handler), but I don't see any such restart established when I browse in my handling function and call computeRestarts.
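The observation about computeRestarts can be checked with a small sketch. `restarts_seen` is a hypothetical helper (not part of the original code) that records the names of the restarts visible from inside a calling handler. It assumes that a restart's name is its first element, which is how base R prints restart objects.

```r
# Hypothetical helper: names of the restarts visible from inside a calling handler.
restarts_seen <- function(expr) {
  seen <- character()
  withCallingHandlers(
    expr,
    condition = function(e) {
      # A restart's name is its first element (the top-level "abort"
      # restart carries no element names, so `$name` would not work for it).
      seen <<- vapply(computeRestarts(), function(r) r[[1L]], character(1))
      # Muffle the demo warning so it is not reported after the handler returns.
      if (inherits(e, "warning")) invokeRestart("muffleWarning")
    }
  )
  seen
}

my_cond <- structure(
  list(message = "this is a custom condition", call = NULL),
  class = c("custom", "condition")
)

from_signal  <- restarts_seen(signalCondition(my_cond))
from_warning <- restarts_seen(warning("demo warning"))

# Restarts offered by warning() but not by a bare signalCondition().
setdiff(from_warning, from_signal)
```

warning() wraps its signal in a muffleWarning restart, whereas a bare signalCondition() establishes nothing comparable, which matches what the browser shows.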

Things That Seem Like Solutions That Won't Work

Use tryCatch

tryCatch does not have the same problem as withCallingHandlers because it does not continue looking for handlers after it finds the first one. The big problem is that it also does not continue evaluating the code after the condition. If you take the example that worked above but substitute tryCatch for withCallingHandlers, the value (25) does not get returned, because execution is brought back to the tryCatch frame after the condition is handled.
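For illustration, here is a minimal sketch of that substitution. `evalcapt_try` and `my_cond` are names made up for this example; the condition is built by hand so the snippet runs on its own.

```r
# Condition of class "custom", built by hand so the sketch is self-contained.
my_cond <- structure(
  list(message = "this is a custom condition", call = NULL),
  class = c("custom", "condition")
)

# Hypothetical tryCatch-based variant of evalcapt.
evalcapt_try <- function(expr) {
  conds <- list()
  val <- tryCatch(
    eval(expr),
    condition = function(e) {
      conds <<- c(conds, list(e))
      NULL  # the handler's return value becomes the value of tryCatch()
    }
  )
  list(val = val, conditions = conds)
}

res <- evalcapt_try(expression(signalCondition(my_cond), 25))
is.null(res$val)        # the second element of the expression (25) was never evaluated
length(res$conditions)  # the condition itself was still captured
```

The condition is recorded, but evaluation stops at the signal, so the 25 is lost.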

So basically, I'm looking for a hybrid between tryCatch and withCallingHandlers, one that returns control to the condition signaler, but also stops looking for more handlers after the first one is found.

Break Up The Expression Into Sub-expression, then Use tryCatch

Okay, but how do you break up a function like this one (let alone more complex functions with conditions signaled all over the place)?

fun <- function(myCondition) {
  signalCondition(myCondition)
  25
}
expr <- expression(fun(myCondition))

Misc

I looked for the source code associated with the .Internal(.signalCondition()) call to see if I can figure out if there is a behind the scenes restart being set, but I'm out of my depth there. It seems like:

    void R_ReturnOrRestart(SEXP val, SEXP env, Rboolean restart)
    {
        int mask;
        RCNTXT *c;

        mask = CTXT_BROWSER | CTXT_FUNCTION;

        for (c = R_GlobalContext; c; c = c->nextcontext) {
        if (c->callflag & mask && c->cloenv == env)
            findcontext(mask, env, val);
        else if (restart && IS_RESTART_BIT_SET(c->callflag))
            findcontext(CTXT_RESTART, c->cloenv, R_RestartToken);
        else if (c->callflag == CTXT_TOPLEVEL)
            error(_("No function to return from, jumping to top level"));
        }
    }

from src/main/errors.c is doing some of that restart invocation, and this is called by do_signalCondition, but I don't have a clue how I would go about messing with this.


Answer 1:


I think what you're looking for is to use withRestarts when your special condition is signaled, as in the source of warning:

    withRestarts({
        .Internal(.signalCondition(cond, message, call))
        .Internal(.dfltWarn(message, call))
    }, muffleWarning = function() NULL)

so

evalcapt <- function(expr) {
  conds <- list()
  withCallingHandlers(
    val <- eval(expr),
    custom=function(e) {
      message("Caught condition of class ", deparse(class(e)))
      conds <<- c(conds, list(e))
      invokeRestart("muffleCustom")
  } )
  list(val=val, conditions=conds)
}

expr <- expression(withRestarts({
    signalCondition(myCondition)
}, muffleCustom=function() NULL), 25)

leads to

> tryCatch(evalcapt(expr))   
Caught condition of class c("custom", "simpleCondition", "condition")
$val
[1] 25

$conditions
$conditions[[1]]
<custom: this is a custom condition>


> tryCatch(
+   evalcapt(expr),               
+   custom=function(e) stop("Hijacked `evalcapt`!")  
+ )
Caught condition of class c("custom", "simpleCondition", "condition")
$val
[1] 25

$conditions
$conditions[[1]]
<custom: this is a custom condition>
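
If the signaling code is under your control, the withRestarts wrapper can live in a small helper instead of being written inline in every expression. This is a sketch only: `signal_custom` and `capture_custom` are hypothetical names, and the handler assumes every "custom" condition is signaled through the helper (invokeRestart fails if no muffleCustom restart is on the stack).

```r
# Hypothetical helper: signal a condition with a "muffleCustom" restart around it.
signal_custom <- function(cond) {
  withRestarts(
    {
      signalCondition(cond)
      invisible(NULL)
    },
    muffleCustom = function() invisible(NULL)
  )
}

# Hypothetical capture function: record "custom" conditions, then muffle them
# so that handlers further down the stack never see them.
capture_custom <- function(code) {
  conds <- list()
  val <- withCallingHandlers(
    code,
    custom = function(e) {
      conds <<- c(conds, list(e))
      invokeRestart("muffleCustom")
    }
  )
  list(val = val, conditions = conds)
}

my_cond <- structure(
  list(message = "this is a custom condition", call = NULL),
  class = c("custom", "condition")
)

# The outer exiting handler is never reached.
res <- tryCatch(
  capture_custom({
    signal_custom(my_cond)
    25
  }),
  custom = function(e) stop("Hijacked!")
)
res$val
```

Because the calling handler sits above the tryCatch handler on the handler stack, it runs first, and invoking the restart ends the search before the outer handler is considered.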



Answer 2:


As far as I can tell there isn't and can't be a simple solution to this problem (I'm happy to be proven wrong). The source of the problem can be seen if we look at how tryCatch and withCallingHandlers register the handlers:

.Internal(.addCondHands(name, list(handler), parentenv, environment(), FALSE)) # tryCatch
.Internal(.addCondHands(classes, handlers, parentenv, NULL, TRUE)) # withCallingHandlers

The key point is the last argument, FALSE in tryCatch, TRUE in withCallingHandlers. This argument leads to the gp bit getting set by do_addCondHands > mkHandlerEntry in src/main/errors.c.

That same bit is then consulted by do_signalCondition (still in src/main/errors.c) when a condition is signaled:

// simplified code excerpt from `do_signalCondition`

PROTECT(oldstack = R_HandlerStack);
while ((list = findConditionHandler(cond)) != R_NilValue) {
  SEXP entry = CAR(list);
  R_HandlerStack = CDR(list);
  if (IS_CALLING_ENTRY(entry)) {   // <<------------- Consult GP bit
    ... // Evaluate handler
  } else gotoExitingHandler(cond, ecall, entry);   // Evaluate handler and exit
}
R_HandlerStack = oldstack;
return R_NilValue;

Basically, if the GP bit is set, we evaluate the handler and keep iterating through the handler stack. If it isn't set, we run gotoExitingHandler, which runs the handler but then returns control to the handling control structure rather than resuming the code where the condition was signaled.

Since the GP bit can only tell you to do one of two things, there is no straightforward way to modify the behavior of this call (i.e. you either iterate through all the handlers if using withCallingHandlers, or you stop at the first matching one registered by tryCatch).
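
The two behaviours can be observed from R without touching the C code. The sketch below uses made-up names (`handler_trace`, `my_cond`) and records which handlers run when the outer handler is a calling one versus an exiting one.

```r
my_cond <- structure(
  list(message = "demo condition", call = NULL),
  class = c("custom", "condition")
)

# Hypothetical helper: signal `my_cond` under an inner calling handler and an
# outer handler that is either calling (withCallingHandlers) or exiting (tryCatch).
handler_trace <- function(outer = c("calling", "exiting")) {
  outer <- match.arg(outer)
  seen <- character()
  inner <- function() {
    withCallingHandlers(
      {
        signalCondition(my_cond)
        "resumed"  # only reached if control returns to the signaler
      },
      custom = function(e) seen <<- c(seen, "inner calling")
    )
  }
  val <- if (outer == "calling") {
    withCallingHandlers(
      inner(),
      custom = function(e) seen <<- c(seen, "outer calling")
    )
  } else {
    tryCatch(
      inner(),
      custom = function(e) {
        seen <<- c(seen, "outer exiting")
        "exited"
      }
    )
  }
  list(val = val, seen = seen)
}

both_calling <- handler_trace("calling")  # both handlers run, code resumes
outer_exits  <- handler_trace("exiting")  # inner runs, then the outer one unwinds
```

In both cases the inner calling handler cannot stop the search on its own; the only difference is whether the outer handler lets the signaling code continue.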

I toyed with the idea of tracing signalCondition to add a restart there, but that seems too hackish.




Answer 3:


With a bit of C you can evaluate an expression within a ToplevelExec() to isolate it from all handlers registered on the stack.

We will expose it at R level in the next rlang version.




Answer 4:


I may be a bit late, but I've been digging into the condition-system as well, and I think I've found some other solutions.

But first: some reasons why this is necessarily a hard problem, not something that can easily be solved generally.
The question is which function is signalling a condition, and whether that function can continue execution after it has signalled one. Errors are implemented as "just a condition" as well, but most functions don't expect to continue after they have called stop().
And some functions may pass on a condition, expecting not to be bothered by it again.
Normally, this means that control can only be returned after a stop() if the function has explicitly said it can accept that, by providing a restart. There may also be other serious conditions that can be signalled, and if a function expects such a condition to always be caught, and you force it to resume execution, things break badly.
What should happen if you had written the following and execution did resume?

myfun <- function(deleteFiles=NULL) {
  if (!all(haveRights(deleteFiles))) stop("Access denied")
  file.remove(deleteFiles)
}
withCallingHandlers(val <- eval(myfun(myProtectedFiles)),
  error=function(e) message("I'm just going to ignore everything..."))

If no other handlers are called (ones that would alert the user that stop has been called), the files would be removed, even though this function has a (small) safeguard against that. For an error this is clear, but the same can hold for other conditions, so I think that's the main reason R doesn't really support stopping the propagation of a condition unless that means halting execution.
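
As a sketch of that opt-in, consider a function that offers a restart around its own stop(). The function `guarded` and the restart name `skipRemoval` are invented for this example, and no files are touched.

```r
# Hypothetical function that explicitly offers a way to continue after stop().
guarded <- function(allowed) {
  withRestarts(
    {
      if (!allowed) stop("Access denied")
      "files removed"
    },
    skipRemoval = function() "skipped"
  )
}

# A calling handler that simply returns does not resume anything:
# the error keeps propagating to the next handler.
res_ignored <- tryCatch(
  withCallingHandlers(guarded(FALSE), error = function(e) NULL),
  error = function(e) "error propagated"
)

# A calling handler that invokes the offered restart continues at the point
# the function itself chose, never after the stop() call.
res_restart <- withCallingHandlers(
  guarded(FALSE),
  error = function(e) invokeRestart("skipRemoval")
)
```

Either way, the code following stop() is not executed unless the function's author decided it could be.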

Nonetheless, I think I've found two ways of hacking around your problem. The first is simply executing expr step by step, which is quite close to Martin Morgan's solution, but moves the withRestarts into your function:

evalcapt <- function(expr) {
  conds <- list()
  for (i in seq_along(expr)) {
    withCallingHandlers(
      val <- withRestarts(
        eval(expr[[i]]),
        muffleCustom = function()
          NULL
      ),
      custom = function(e) {
        message("Caught condition of class ", deparse(class(e)))
        conds <<- c(conds, list(e))
        invokeRestart(findRestart("muffleCustom"))
      })
  }
  list(val = val, conditions = conds)
}

The main disadvantage is that this doesn't dig into functions: expr is executed one element at a time, at the level at which it is called.
So if you call evalcapt(myfun()), the for-loop sees this as one instruction. That one instruction signals a condition, so it does not return, so you lose any output that would have been produced had you not been catching anything. OTOH, evalcapt(expression(signalCondition(myCondition), 25)) does work as requested, as this is an expression with 2 elements, each of which is evaluated separately.
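
The difference between the two cases can be reproduced with a self-contained variant of the step-by-step idea. `eval_stepwise`, `my_cond` and `fun` are names chosen for this sketch, and it passes the restart name directly to invokeRestart.

```r
my_cond <- structure(
  list(message = "this is a custom condition", call = NULL),
  class = c("custom", "condition")
)

# Hypothetical step-by-step evaluator: one restart per element of `expr`.
eval_stepwise <- function(expr) {
  conds <- list()
  val <- NULL
  for (i in seq_along(expr)) {
    val <- withCallingHandlers(
      withRestarts(eval(expr[[i]]), muffleCustom = function() NULL),
      custom = function(e) {
        conds <<- c(conds, list(e))
        invokeRestart("muffleCustom")
      }
    )
  }
  list(val = val, conditions = conds)
}

fun <- function() {
  signalCondition(my_cond)
  25
}

# Two top-level elements: the signal is muffled, then 25 is evaluated.
top_level <- eval_stepwise(expression(signalCondition(my_cond), 25))

# One element wrapping the same code: the restart abandons the whole call.
nested <- eval_stepwise(expression(fun()))
```

Here `top_level$val` is 25 while `nested$val` is NULL, even though both expressions signal the same condition once.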

If you want to go hardcore, I think you could try evaluating myfun() step-by-step, but there is always the question how deep you want to go. If myfun() calls myotherfun(), which calls myotherotherfun(), do you want to return control to the point where myfun failed, or myotherfun, or myotherotherfun?
Basically, it's just a guess about what level you want to halt execution, and where you want to resume.

So a second solution: hijack any call to signalCondition. This means you'll probably end up at a quite deep level, although not the very deepest (no primitives, or code that calls .signalCondition).
I think this works best if you're really sure that your custom condition is only thrown by code that is written by you: it means that execution resumes directly after signalCondition.
Which gives me this function:

evalcapt <- function(expr) {
  if(exists('conds', parent.frame(), inherits=FALSE)) {
    conds_backup <- get('conds', parent.frame(), inherits=FALSE)
    on.exit(assign('conds', conds_backup, parent.frame(), inherits=FALSE), add=TRUE)
  } else {
    on.exit(rm('conds', pos=parent.frame(), inherits=FALSE), add=TRUE)
  }
  assign('conds', list(), parent.frame(), inherits=FALSE)
  origsignalCondition <- signalCondition
  if(exists('signalCondition', parent.frame(), inherits=FALSE)) {
    signal_backup <- get('signalCondition', parent.frame(), inherits=FALSE)
    on.exit(assign('signalCondition', signal_backup, parent.frame(), inherits=FALSE), add=TRUE)
  } else {
    on.exit(rm('signalCondition', pos=parent.frame(), inherits=FALSE), add=TRUE)
  }
  assign('signalCondition', function(e) {
    if(is(e, 'custom')) {
      message("Caught condition of class ", deparse(class(e)))
      conds <<- c(conds, list(e))
    } else {
      origsignalCondition(e)
    }
  }, parent.frame())
  val <- eval(expr, parent.frame())
  list(val=val, conditions=conds)
}

It looks way messier, but that's mostly because there are more issues with which environment to use. The differences are that here I use the calling environment as the evaluation context, and the hijacked signalCondition() needs to live there too. Afterwards we need to clean up.
But the main point is overwriting signalCondition: if we see a custom condition we log it and return control. If it's another condition, we pass it on.

Here there may be some smaller disadvantages:

  • You may end up in a deeper function, where the bug is the way myfun calls myotherfun, but you end up in myotherfun (or deeper).
  • It only catches occurrences where signalCondition is called. If you call e.g. warning(myCondition), nothing is caught.
  • If a function in another package/another environment calls signalCondition, then it uses its own searchpath, meaning our signalCondition might be bypassed, and base::signalCondition is used instead.
  • When debugging, it's a lot uglier. Variables are assigned in environments where you don't expect them (and then disappear when you exit a function), the scope for different functions may be unclear, parent.frame() might give other results than you'd expect, etc.
  • And as said before: all functions must be able to handle re-entrance after throwing a condition.
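
A variation on the same idea, offered only as a sketch, avoids writing into the caller's environment: evaluate expr in a throwaway child environment that contains the masking signalCondition. `eval_masked` is a hypothetical name. The trade-off is that only calls to signalCondition written directly in expr are intercepted, since functions defined elsewhere resolve signalCondition through their own enclosing environments.

```r
# Hypothetical variant: mask signalCondition in a child environment instead of
# assigning into (and later cleaning up) the calling environment.
eval_masked <- function(expr, envir = parent.frame()) {
  conds <- list()
  mask <- new.env(parent = envir)
  mask$signalCondition <- function(cond, ...) {
    if (inherits(cond, "custom")) {
      # Log the condition and return to the signaler without running handlers.
      conds <<- c(conds, list(cond))
      invisible(NULL)
    } else {
      base::signalCondition(cond, ...)
    }
  }
  val <- eval(expr, mask)
  list(val = val, conditions = conds)
}

my_cond <- structure(
  list(message = "this is a custom condition", call = NULL),
  class = c("custom", "condition")
)

res <- tryCatch(
  eval_masked(expression(signalCondition(my_cond), 25)),
  custom = function(e) stop("Hijacked!")
)
res$val
```

Nothing needs to be restored on exit because the mask disappears with the child environment.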


Source: https://stackoverflow.com/questions/20572288/capture-arbitrary-conditions-with-withcallinghandlers
