dplyr: Evaluation error: object '.' not found with gamlss but all good with lm, gam, glm methods

问题

Context: tidyverse and dplyr environment/work-flow.

I'd appreciate insights into how to resolve the following issue, which I have encountered while trying to work with collections of regression results.

This minimal reproducible shows the issue

mtcars %>% 
  gamlss(mpg ~ hp + wt + disp, data = .) %>% 
  model.frame()

The example below illustrates a broader context and works as expected (producing the images shown). It also works if all I do is change ~lm(...) to be ~glm(...) or ~gam(...):

library(tidyverse)
library(broom)
library(gamlss)

library(datasets)

mtcars %>% 
  nest(-am) %>% 
  mutate(am = factor(am, levels = c(0, 1), labels = c("automatic", "manual")),
         fit = map(data, ~lm(mpg ~ hp + wt + disp, data = .)),
         results = map(fit, augment)) %>% 
  unnest(results) %>% 
  ggplot(aes(x = mpg, y = .fitted)) +
    geom_abline(intercept = 0, slope = 1, alpha = .2) +  # Line of perfect fit
    geom_point() +
    facet_grid(am ~ .) +
    labs(x = "Miles Per Gallon", y = "Predicted Value") +
    theme_bw()

However, If I try to use ~gamlss(...) as follows:

mtcars %>% 
  nest(-am) %>% 
  mutate(am = factor(am, levels = c(0, 1), labels = c("automatic", "manual")),
         fit = map(data, ~gamlss(mpg ~ hp + wt + disp, data = .)),
         results = map(fit, augment)) %>% 
  unnest(results) %>% 
  ggplot(aes(x = mpg, y = .fitted)) +
    geom_abline(intercept = 0, slope = 1, alpha = .2) +  # Line of perfect fit
    geom_point() +
    facet_grid(am ~ .) +
    labs(x = "Miles Per Gallon", y = "Predicted Value") +
    theme_bw()

I observe the following error:

GAMLSS-RS iteration 1: Global Deviance = 58.7658 
GAMLSS-RS iteration 2: Global Deviance = 58.7658 
GAMLSS-RS iteration 1: Global Deviance = 76.2281 
GAMLSS-RS iteration 2: Global Deviance = 76.2281 
******************************************************************
Family:  c("NO", "Normal") 

Call:  gamlss(formula = mpg ~ hp + wt + disp, data = .) 

Fitting method: RS() 

------------------------------------------------------------------
Mu link function:  identity
Mu Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) 43.811721   3.387118  12.935 4.05e-07 ***
hp           0.001768   0.021357   0.083  0.93584    
wt          -6.982534   1.998827  -3.493  0.00679 ** 
disp        -0.019569   0.021460  -0.912  0.38559    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

------------------------------------------------------------------
Sigma link function:  log
Sigma Coefficients:
            Estimate Std. Error t value Pr(>|t|)   
(Intercept)   0.8413     0.1961    4.29  0.00105 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

------------------------------------------------------------------
No. of observations in the fit:  13 
Degrees of Freedom for the fit:  5
      Residual Deg. of Freedom:  8 
                      at cycle:  2 

Global Deviance:     58.76579 
            AIC:     68.76579 
            SBC:     71.59054 
******************************************************************
Error in mutate_impl(.data, dots) : 
  Evaluation error: object '.' not found.
In addition: Warning messages:
1: Deprecated: please use `purrr::possibly()` instead 
2: Deprecated: please use `purrr::possibly()` instead 
3: Deprecated: please use `purrr::possibly()` instead 
4: Deprecated: please use `purrr::possibly()` instead 
5: Deprecated: please use `purrr::possibly()` instead 
6: In summary.gamlss(model) :
  summary: vcov has failed, option qr is used instead

15: stop(list(message = "Evaluation error: object '.' not found.", 
        call = mutate_impl(.data, dots), cppstack = NULL))
14: .Call(`_dplyr_mutate_impl`, df, dots)
13: mutate_impl(.data, dots)
12: mutate.tbl_df(tbl_df(.data), ...)
11: mutate(tbl_df(.data), ...)
10: as.data.frame(mutate(tbl_df(.data), ...))
9: mutate.data.frame(., am = factor(am, levels = c(0, 1), labels = c("automatic", 
       "manual")), fit = map(data, ~gamlss(mpg ~ hp + wt + disp, 
       data = .)), results = map(fit, augment))
8: mutate(., am = factor(am, levels = c(0, 1), labels = c("automatic", 
       "manual")), fit = map(data, ~gamlss(mpg ~ hp + wt + disp, 
       data = .)), results = map(fit, augment))
7: function_list[[i]](value)
6: freduce(value, `_function_list`)
5: `_fseq`(`_lhs`)
4: eval(quote(`_fseq`(`_lhs`)), env, env)
3: eval(quote(`_fseq`(`_lhs`)), env, env)
2: withVisible(eval(quote(`_fseq`(`_lhs`)), env, env))
1: mtcars %>% nest(-am) %>% mutate(am = factor(am, levels = c(0, 
       1), labels = c("automatic", "manual")), fit = map(data, ~gamlss(mpg ~ 
       hp + wt + disp, data = .)), results = map(fit, augment)) %>% 
       unnest(results) %>% ggplot(aes(x = mpg, y = .fitted))

Does anyone have suggestion as to what needs to change to make this example work as expected?

I'd appreciate any insights into what is going wrong. Why it does not work. How to diagnose this sort of issue(s).

回答1:

model.frame.gamlss does not consider the original environment of the data argument properly.

See my comments in the code below:

model.frame.gamlss <- function(formula, what = c("mu", "sigma", "nu", "tau"), parameter = NULL, ...) 
{
    object <- formula
    dots <- list(...)
    what <- if (!is.null(parameter)) {
        match.arg(parameter, choices = c("mu", "sigma", "nu", "tau"))
    } else match.arg(what)
    Call <- object$call
    parform <- formula(object, what)
    data <- if (!is.null(Call$data)) {
        # problem here, as Call$data is .
        eval(Call$data)
        # instead, this would work:
        # eval(Call$data, environment(formula$mu.terms))
        # (there is no formula$terms, just mu.terms and sigma.terms)
    } else {
        environment(formula$terms)
    }
    Terms <- terms(parform)
    mf <- model.frame(
        Terms, 
        data, 
        xlev = object[[paste(what, "xlevels", sep = ".")]]
    )
    mf
}

I guess an issue should be filed with the gamlss maintainer(s), unless this has been done already.

回答2:

Building on RolnadASc's partial answer...

The intermediate datasets and the chart output of the original problem are reproduced by the following. An improvement from using this approach is that we do not require the intermediate step creating fit.

library(tidyverse)
library(broom)
library(gamlss)
library(datasets)

model.frame.gamlss <- function(formula, what = c("mu", "sigma", "nu", "tau"), parameter = NULL, ...) 
{
    object <- formula
    dots <- list(...)
    what <- if (!is.null(parameter)) {
        match.arg(parameter, choices = c("mu", "sigma", "nu", "tau"))
    } else match.arg(what)
    Call <- object$call
    parform <- formula(object, what)
    data <- if (!is.null(Call$data)) {
        # problem here, as Call$data is .
        # eval(Call$data)
        # instead, this would work:
        eval(Call$data, environment(formula$mu.terms))
        # (there is no formula$terms, just mu.terms and sigma.terms)
    } else {
        environment(formula$terms)
    }
    Terms <- terms(parform)
    mf <- model.frame(
        Terms, 
        data, 
        xlev = object[[paste(what, "xlevels", sep = ".")]]
    )
    mf
}
aug_func <- function(df){
          augment(gamlss(mpg ~ hp + wt + disp, data=df))
        }

mtcars %>% 
  mutate(am = factor(am, levels = c(0, 1), labels = c("automatic", "manual"))) %>%
  group_by(am) %>%
  do(aug=aug_func(df=.)) %>%
  unnest(aug) %>%
    ggplot(aes(x = mpg, y = .fitted)) +
      geom_abline(intercept = 0, slope = 1, alpha = .2) +  # Line of perfect fit
      geom_point() +
      facet_grid(am ~ .) +
      labs(x = "Miles Per Gallon gamlss", y = "Predicted Value gamlss") +
      theme_bw()

来源：https://stackoverflow.com/questions/48979322/dplyr-evaluation-error-object-not-found-with-gamlss-but-all-good-with-lm

标签

dplyr

gamlss