R: How to make dump.frames() include all variables for later post-mortem debugging with debugger()

眉间皱痕 posted on 2020-01-14 12:44:08

Question


I have the following code, which provokes an error and writes a dump of all frames using dump.frames(), as proposed e.g. by Hadley Wickham:

a <- -1
b <- "Hello world!"
bad.function <- function(value)
{
  log(value)                  # the log function may cause an error or warning depending on the value
}

tryCatch( {
             a.local.value <- 42
             bad.function(a)
             bad.function(b)
          },
          error = function(e)
          {
            dump.frames(to.file = TRUE)
          })

When I restart the R session and load the dump to debug the problem via

load(file = "last.dump.rda")
debugger(last.dump)

I cannot find my variables (a, b, a.local.value) nor my function "bad.function" anywhere in the frames.

This makes the dump nearly worthless to me.

What do I have to do to see all my variables and functions for a decent post-mortem analysis?

The output of debugger is:

> load(file = "last.dump.rda")
> debugger(last.dump)
Message:  non-numeric argument to mathematical functionAvailable environments had calls:
1: tryCatch({
    a.local.value <- 42
    bad.function(a)
    bad.function(b)
2: tryCatchList(expr, classes, parentenv, handlers)
3: tryCatchOne(expr, names, parentenv, handlers[[1]])
4: value[[3]](cond)

Enter an environment number, or 0 to exit  
Selection: 

PS: I am using R 3.3.2 with RStudio for debugging.


Answer 1:


Note that it is often more productive to work with the R Core team than to just claim that R has a bug. There is clearly no bug here, as R behaves exactly as documented.

Also, there is no problem if you work interactively, since you then have full access to your workspace (which may be LARGE); the problem applies only to batch jobs (as you've mentioned).

What we have here is rather a missing feature, and feature requests (and bug reports!) should be filed on the R bug site (aka 'R bugzilla'), https://bugs.r-project.org/ ... typically, however, after having read the corresponding page on the R website: https://www.r-project.org/bugs.html.

Note that R bugzilla is searchable, and in the present case you'd pretty quickly find that Andreas Kersting made a nice proposal (as a wish, rather than claiming a bug), https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17116, and consequently I already added the missing feature to R on Aug. 16. That is, of course, to the development version of R, aka R-devel. See also today's thread on the R-devel mailing list, https://stat.ethz.ch/pipermail/r-devel/2016-November/073378.html
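For readers on R 3.4.0 or later, the feature added in response to that proposal is available as the include.GlobalEnv argument of dump.frames(). The following is a minimal sketch rather than a definitive recipe: it reuses the objects from the question, writes the dump into a temporary directory (an assumption made only to keep the demo tidy), and loads the file into an empty environment as a stand-in for a fresh session.

```r
# Sketch only: requires R >= 3.4.0 for the include.GlobalEnv argument.
a <- -1
b <- "Hello world!"
bad.function <- function(value) log(value)

old.wd <- setwd(tempdir())   # demo assumption: keep last.dump.rda out of the project directory

tryCatch(
  bad.function(b),
  error = function(e) {
    # Writes last.dump.rda; a copy of the workspace is stored
    # as an additional entry named ".GlobalEnv".
    dump.frames(to.file = TRUE, include.GlobalEnv = TRUE)
  }
)

# Stand-in for a new R session: load the dump into an empty environment.
restored <- new.env()
load("last.dump.rda", envir = restored)

print(names(restored$last.dump))               # list of dumped environments
print(ls(restored$last.dump[[".GlobalEnv"]]))  # top-level objects at dump time

setwd(old.wd)
```

In an interactive session you would call debugger(last.dump) after load() instead of the print() calls. Because the workspace is copied into the dump, the file can become large if the workspace is large.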




Answer 2:


Update Nov. 20, 2016: Note that this is not an R bug (see the answer of Martin Maechler). I have left my answer unchanged for reproducibility. The workaround described below still applies.

Summary

I think dump.frames(to.file = TRUE) is currently an anti-pattern (or probably a bug) in R if you want to debug errors of batch jobs in a new R session.

It is better to replace it with

  dump.frames()
  save.image(file = "last.dump.rda")

or

options(error = quote({dump.frames(); save.image(file = "last.dump.rda")}))

instead of

options(error = dump.frames)

because the global environment (.GlobalEnv, the user workspace where you normally create your objects) is then included in the dump, whereas it is missing when you save the dump directly via dump.frames(to.file = TRUE).

Impact analysis

Without the .GlobalEnv you lose important top-level objects (and their current values ;-) that you need to understand the behaviour of the code that led to the error!

Especially for errors in non-interactive R batch jobs you are lost without .GlobalEnv, since you can debug only in a newly started (empty) interactive workspace, where you can access only the objects in the call stack frames.

Using the code snippet above you can examine the object values that led to the error in a new R workspace as usual via:

load(file = "last.dump.rda")
debugger(last.dump)
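Since debugger() is interactive, a quick non-interactive check of what the workaround actually stores can be useful. The sketch below is only an illustration under a few assumptions: the file goes to a temporary path (dump.file is a hypothetical name), and an empty environment stands in for the new R session.

```r
a <- -1
b <- "Hello world!"
bad.function <- function(value) log(value)

# Hypothetical temporary location; a batch job would use a fixed path instead.
dump.file <- file.path(tempdir(), "last.dump.rda")

tryCatch(
  bad.function(b),
  error = function(e) {
    dump.frames()                 # assigns last.dump in .GlobalEnv
    save.image(file = dump.file)  # saves the whole workspace, last.dump included
  }
)

# Stand-in for a new R session: load the image into an empty environment.
restored <- new.env()
load(dump.file, envir = restored)

print(ls(restored))               # workspace objects plus last.dump
print(names(restored$last.dump))  # the calls whose frames were dumped
```

In a real post-mortem you would load() the file into the fresh global environment and then call debugger(last.dump), so that the restored top-level objects sit next to the dumped frames.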

Background

The implementation of dump.frames creates a variable last.dump and fills it with the environments of the call stack (sys.frames()); each environment contains the "local variables" of the called function. When to.file = TRUE, it then saves this variable into a file using save().

The frame stack (call stack) grows with each call of a function, see ?sys.frames:

.GlobalEnv is given number 0 in the list of frames. Each subsequent function evaluation increases the frame stack by 1 and the [...] environment for evaluation of that function are returned by [...] sys.frame with the appropriate index.

Observe that the .GlobalEnv has the index number 0.

If I now start debugging the dump produced by the code in the question and select frame 1 (not 0!), I can see a variable parentenv which points to (references) the .GlobalEnv:

Browse[1]> environmentName(parentenv)
[1] "R_GlobalEnv"

Hence I believe that sys.frames() does not contain the .GlobalEnv, and therefore neither does dump.frames(to.file = TRUE), since it stores only the sys.frames() without all the other objects of the .GlobalEnv.
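The relationship between frame 0 and sys.frames() can be checked directly. This is a small sketch with a hypothetical helper inner(); it only inspects the innermost frame, because the outer frames depend on how the script is run.

```r
inner <- function() {
  local.value <- 42
  sys.frames()   # environments of the currently active function calls
}

frames <- inner()
innermost <- frames[[length(frames)]]   # the frame of inner() itself

# Frame 0 is the global environment ...
print(identical(sys.frame(0), globalenv()))

# ... whereas the innermost entry of sys.frames() is inner()'s own
# environment, which holds its local variable and is not the workspace.
print(exists("local.value", envir = innermost, inherits = FALSE))
print(identical(innermost, globalenv()))
```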

Maybe I am wrong, but this looks like an unwanted effect or even a bug. Discussions welcome!
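A related observation about the question's output: bad.function itself is missing from the listed frames. As far as I can tell this is because a tryCatch() handler runs only after the stack has been unwound back to tryCatch(), whereas a calling handler runs while the failing call is still on the stack. The sketch below assumes that switching to withCallingHandlers() is acceptable for your job; the outer try() is there only so that the demo script continues after the error.

```r
bad.function <- function(value) {
  log(value)
}

try(
  withCallingHandlers(
    bad.function("Hello world!"),
    error = function(e) {
      # Called at the point of the error, so the frame of
      # bad.function() is still part of the call stack.
      dump.frames()
    }
  ),
  silent = TRUE
)

# dump.frames() without to.file assigns last.dump in .GlobalEnv.
captured <- get("last.dump", envir = globalenv())
print(names(captured))
```

Setting options(error = ...), as shown in the summary, also dumps at the point of the error, so it is only the tryCatch() variant that loses the inner frames.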

References

https://cran.r-project.org/doc/manuals/R-exts.pdf

Excerpt from section 4.2 Debugging R code (page 96):

Because last.dump can be looked at later or even in another R session, post-mortem debugging is possible even for batch usage of R. We do need to arrange for the dump to be saved: this can be done either using the command-line flag --save to save the workspace at the end of the run, or via a setting such as

options(error = quote({dump.frames(to.file=TRUE); q()}))



Source: https://stackoverflow.com/questions/40421552/r-how-make-dump-frames-include-all-variables-for-later-post-mortem-debugging
