update() inside a function only searches the global environment?

问题

I tried to write a wrapper function to do likelihood ratio tests in batches. I tried to include update() to update the initial model. However, it seems that instead of looking for objects inside the function, it searches for objects in the global environment.

fake <- data.frame(subj= rep(1:5, 4), 
                   factor1 = rep(LETTERS[c(1,2,1,2)], each=5), 
                   factor2 = rep(letters[1:2], each=10), 
                   data=sort(rlnorm(20)))

foo <- function(){
                  temp <- fake
                  model1 <- lmer(data~factor1*factor2 + (1 |subj), temp)
                  model1a <- update(model1, ~.-factor1:factor2)
                  model1a}

And it gives an error message below:

Error in eval(expr, envir, enclos) : object 'factor1' not found

Is there anyway to make update() search within the function? Thank you!

EDIT:

I made a mistake. I wanted to pass "temp" to lmer, not "fake".

EDIT2: One convenient solution suggested is to simply specify the data object. Although update() now has no problem with this, anova() seems to think that the models I am trying to compare are based on different data objects

 foo <- function(){
                  temp <- fake
                  model1 <- lmer(data~factor1*factor2 + (1 |subj), data=temp)
                  model1a <- update(model1, ~.-factor1:factor2, data=temp)
                  anova(model1, model1a)
            }
 foo()

I get an error message:

 Error in anova(model1, model1b) : 
   all models must be fit to the same data object

I suppose this error goes beyond update(). But I wonder if anyone knows how this can be resolved. Note that if I write the function without using update() and instead spell out the models (see below), the error above goes away:

 foo <- function(){
                  temp <- fake
                  model1 <- lmer(data~factor1*factor2 + (1 |subj), data=temp)
                  model1a <- lmer(data~factor1 + factor2 + (1 |subj), data=temp)
                  anova(model1, model1a)
            }
 foo()

 Data: temp
 Models:
 model1a: data ~ factor1 + factor2 + (1 | subj)
 model1: data ~ factor1 * factor2 + (1 | subj)
         Df     AIC    BIC  logLik  Chisq Chi Df Pr(>Chisq)  
 model1a  5 -4.6909 3.7535  7.3454                           
 model1   6 -8.8005 1.3327 10.4003 6.1097      1    0.01344 *
 ---
 Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

EDIT 3: It seems that the issue is with anova(). I also tried the suggestion by @hadley

foo2 <- function(){
  my_update <- function(mod, formula = NULL, data = NULL) {
  call <- getCall(mod)
  if (is.null(call)) {
    stop("Model object does not support updating (no call)", call. = FALSE)
  }
  term <- terms(mod)
  if (is.null(term)) {
    stop("Model object does not support updating (no terms)", call. = FALSE)
  }
  if (!is.null(data)) call$data <- data
  if (!is.null(formula)) call$formula <- update.formula(call$formula, formula)
  env <- attr(term, ".Environment")
  eval(call, env, parent.frame())}

      model1 <- lmer(data~factor1*factor2 + (1 |subj), temp)
      model1a <- my_update(model1, ~.-factor1:factor2)
      anova(model1, model1a)
 }
 foo2()

I got an error message as shown below:

 Error in as.data.frame.default(data) : 
   cannot coerce class 'structure("mer", package = "lme4")' into a data.frame

回答1:

I've been bitten by this behaviour before too, so I wrote my own version of update. It evaluates everything in the environment of the formula, so it should be fairly robust.

my_update <- function(mod, formula = NULL, data = NULL) {
  call <- getCall(mod)
  if (is.null(call)) {
    stop("Model object does not support updating (no call)", call. = FALSE)
  }
  term <- terms(mod)
  if (is.null(term)) {
    stop("Model object does not support updating (no terms)", call. = FALSE)
  }

  if (!is.null(data)) call$data <- data
  if (!is.null(formula)) call$formula <- update.formula(call$formula, formula)
  env <- attr(term, ".Environment")

  eval(call, env, parent.frame())
}

library(nlme4)

fake <- data.frame(
  subj = rep(1:5, 4), 
  factor1 = rep(LETTERS[c(1,2,1,2)], each = 5), 
  factor2 = rep(letters[1:2], each = 10), 
  data = sort(rlnorm(20)))

foo <- function() {
  temp <- fake
  model1 <- lmer(data ~ factor1 * factor2 + (1 | subj), fake)
  model1a <- my_update(model1, ~ . - factor1:factor2)
  model1a
}
foo()

回答2:

Although I really like @Hadley's answer (and will likely use that function myself), you can also specify a data argument in the update function. (Here, I assumed you wanted to pass temp to your models.)

model1a <- update(model1, ~.-factor1:factor2, data = temp)

EDIT

If you're looking to compare models with anova, update will mung up the name of the data argument and "trick" anova into believing that the two models were fit using two different datasets. Updating only the formula and creating a new model will avoid this:

foo <- function(){
                  temp <- fake
                  model1 <- lmer(data~factor1*factor2 + (1 |subj), data=temp)
                  newForm <- update.formula(formula(model1), ~.-factor1:factor2)
                  model1a <- lmer(newForm, data=temp)
                  anova(model1, model1a)
            }

来源：https://stackoverflow.com/questions/13690184/update-inside-a-function-only-searches-the-global-environment

标签

function

environment