mgcv: How to use 'exclude' argument in predict.gam?

Submitted by 北战南征 on 2020-11-28 08:23:37


I have a model structured as follows, and I would like to extract the predicted values while ignoring the random effect. As specified in ?predict.gam and here, I am using the exclude argument, but I am getting an error. Where is my mistake?

dt <- data.frame(n1 = runif(500, min=0, max=1),
                 n2 = rep(1:10,50),
                 n3 = runif(500, min=0, max=2),
                 n4 = runif(500, min=0, max=2),
                 c1 = factor(rep(c("X","Y"),250)),
                 c2 = factor(rep(c("a", "b", "c", "d", "e"), 100)))

library(mgcv)

mod = gam(n1 ~ 
           s(n2, n3, n4, by=c1) +
           s(c2, bs="re"),
         data=dt)

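library(data.table)

## prediction grid over the smooth's covariates (n2, n3, n4) and the by-factor c1
newd=data.table(expand.grid(n2=seq(min(dt$n2), max(dt$n2), 0.5), 
                        n3=seq(min(dt$n3), max(dt$n3), 0.5),
                        n4=seq(min(dt$n4), max(dt$n4), 0.5),
                        c1=c("X", "Y")))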
newd$pred <- predict.gam(mod, newd, exclude = "s(c2)")

In predict.gam(mod, newd, exclude = "s(c2)"): not all required variables have been supplied in  newdata! 


exclude does not work the way you assumed. You still need to supply every variable used in the model in your newd for predict.gam, including those that appear only in excluded terms. See my other answer for what happens inside predict.gam.

Here is what you need to do:

## pad newd with an arbitrary value for variable c2
newd$c2 <- "a"
## termwise prediction
pt <- predict.gam(mod, newd, type = "terms", exclude = "s(c2)")
## linear predictor without random effect
lp_no_c2 <- rowSums(pt) + attr(pt, "constant")
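
As a sanity check on the recipe above, here is a minimal self-contained sketch (not part of the original answer). It assumes the same simulated data and model as in the question; the seed, the grid sizes, and the names newd_b, lp_direct and lp_other are made up for illustration. It compares the termwise sum with a default (type = "link") prediction that uses exclude on the padded data, and it also pads c2 with a different level to check that the arbitrary choice does not affect the result.

```r
library(mgcv)

set.seed(1)  # arbitrary seed, only for reproducibility of this sketch
dt <- data.frame(n1 = runif(500, min = 0, max = 1),
                 n2 = rep(1:10, 50),
                 n3 = runif(500, min = 0, max = 2),
                 n4 = runif(500, min = 0, max = 2),
                 c1 = factor(rep(c("X", "Y"), 250)),
                 c2 = factor(rep(c("a", "b", "c", "d", "e"), 100)))

mod <- gam(n1 ~ s(n2, n3, n4, by = c1) + s(c2, bs = "re"), data = dt)

## prediction grid over the smooth's covariates and the by-factor
newd <- expand.grid(n2 = 1:10,
                    n3 = seq(min(dt$n3), max(dt$n3), length.out = 5),
                    n4 = seq(min(dt$n4), max(dt$n4), length.out = 5),
                    c1 = c("X", "Y"))

## pad with an arbitrary but valid level of c2
newd$c2 <- factor("a", levels = levels(dt$c2))

## route 1: termwise prediction, dropping the random-effect column
pt <- predict(mod, newd, type = "terms", exclude = "s(c2)")
lp_no_c2 <- as.numeric(rowSums(pt) + attr(pt, "constant"))

## route 2: linear predictor with the excluded smooth set to zero
lp_direct <- as.numeric(predict(mod, newd, exclude = "s(c2)"))

## route 2 again, padded with a different level of c2
newd_b <- newd
newd_b$c2 <- factor("b", levels = levels(dt$c2))
lp_other <- as.numeric(predict(mod, newd_b, exclude = "s(c2)"))

print(all.equal(lp_no_c2, lp_direct))
print(all.equal(lp_direct, lp_other))
```

If both comparisons agree, the padding value is only a placeholder that lets predict.gam build the model matrix, and the excluded term contributes nothing to the result.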

