ggplot2: how to get robust confidence interval for predictions in geom_smooth?

Posted by 人走茶凉 on 2019-12-10 16:41:55

Question


Consider this simple example:

library(tidyverse)

dataframe <- tibble(x = c(1,2,3,4,5,6),
                    y = c(12,24,24,34,12,15))
> dataframe
# A tibble: 6 x 2
      x     y
  <dbl> <dbl>
1     1    12
2     2    24
3     3    24
4     4    34
5     5    12
6     6    15    

dataframe %>% ggplot(., aes(x = x, y = y)) + 
geom_point() + 
geom_smooth(method = 'lm', formula = y~x)

Here the standard errors are computed with the default (non-robust) option. However, I would like to use a robust variance-covariance matrix, as available through the sandwich and lmtest packages.

That is, using vcovHC(mymodel, "HC3").

Is there a simple way to get that with the geom_smooth() function?


Answer 1:


HC robust SEs (simple)

This is now easily done thanks to the estimatr package and its family of lm_robust functions, which use HC2 standard errors by default. E.g.

library(tidyverse)
library(estimatr)

dataframe %>% 
  ggplot(aes(x = x, y = y)) + 
  geom_point() + 
  geom_smooth(method = 'lm_robust', formula = y~x, fill="#E41A1C") + ## Robust (HC) SEs
  geom_smooth(method = 'lm', formula = y~x) + ## Just for comparison
  theme_minimal()
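
The question asked for HC3 specifically, whereas the plot above uses the lm_robust default. A minimal sketch of one way to request HC3, assuming that geom_smooth() forwards method.args to lm_robust() and that lm_robust() accepts se_type = "HC3" (check this against your installed versions of ggplot2 and estimatr):

```r
library(ggplot2)
library(estimatr)

# Self-contained copy of the example data from the question
dataframe <- data.frame(x = c(1, 2, 3, 4, 5, 6),
                        y = c(12, 24, 24, 34, 12, 15))

# Assumption: method.args is passed through to lm_robust(), so se_type
# selects the HC variant used for the confidence band
p <- ggplot(dataframe, aes(x = x, y = y)) +
  geom_point() +
  geom_smooth(method = "lm_robust", formula = y ~ x,
              method.args = list(se_type = "HC3"),
              fill = "#E41A1C") +
  theme_minimal()

p
```

If the band does not change when you switch se_type, the argument is probably not reaching the fitting function, and the manual geom_ribbon() route is the safer option.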

HAC robust SEs (a bit more legwork)

The one caveat is that estimatr does not yet offer support for HAC (i.e. heteroscedasticity and autocorrelation consistent) SEs a la Newey-West. However, it is possible to obtain these manually with the sandwich package (which is kind of what the original question was asking anyway) and then plot using geom_ribbon().

I'll say for the record that HAC SEs don't make much sense for this particular data set, but here's an example of how you could do it, riffing off this excellent SO answer on a related topic.

reg1 <- lm(y~x, data = dataframe)

## Generate a prediction DF
pred_df <-
  data.frame(predict(reg1, se.fit = TRUE, interval = "confidence")) %>% 
  as_tibble() 
## Clean up a little bit (optional): drop the leading "fit." prefix,
## so fit.fit / fit.lwr / fit.upr become fit / lwr / upr
colnames(pred_df) <- gsub("^fit\\.", "", colnames(pred_df))

## Get the design matrix
X_mat <- model.matrix(reg1)

## Get HAC VCOV matrix and calculate SEs
library(sandwich)
v_hac <- NeweyWest(reg1, prewhite = FALSE, adjust = TRUE) ## HAC VCOV (adjusted for small data sample)
var_fit_hac <- rowSums((X_mat %*% v_hac) * X_mat)  ## Point-wise variance for predicted mean

## Add these to pred_df
pred_df <-
  pred_df %>%
  mutate(se_fit_hac = sqrt(var_fit_hac)) %>%
  mutate(
    lwr_hac = fit - qt(0.975, df=df)*se_fit_hac,
    upr_hac = fit + qt(0.975, df=df)*se_fit_hac
    )

bind_cols(
  dataframe,
  pred_df
  ) %>%
  ggplot(aes(x = x, y = y, ymin=lwr_hac, ymax=upr_hac)) + 
  geom_point() + 
  geom_ribbon(fill="#E41A1C", alpha=0.3, col=NA) + ## Robust (HAC) SEs
  geom_smooth(method = 'lm', formula = y~x) + ## Just for comparison
  theme_minimal()

Note that you could also use this approach to manually calculate and plot predictions with other robust SEs (e.g. HC1, HC2, etc.) if you so wished. All you would need to do is use the relevant sandwich estimator. For instance, using vcovHC(reg1, type = "HC2") instead of NeweyWest(reg1, prewhite = FALSE, adjust = TRUE) will give you an HC-robust CI identical to the one in the first example, which uses the estimatr package.
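
The point-wise variance of the fitted mean at observation i is x_i' V x_i, where V is whichever covariance matrix you plug in. A minimal base-R sketch of that step, using the classical vcov() only so that the result can be checked against predict(); the names var_fit_fast and var_fit_full are illustrative, not from the original answer:

```r
# Self-contained copy of the example data from the question
dataframe <- data.frame(x = c(1, 2, 3, 4, 5, 6),
                        y = c(12, 24, 24, 34, 12, 15))
reg1 <- lm(y ~ x, data = dataframe)

X_mat <- model.matrix(reg1)  # n x p design matrix
V     <- vcov(reg1)          # classical VCOV; swap in a sandwich estimator here

# Shortcut used in the answer: row i of (X V) * X sums to x_i' V x_i
var_fit_fast <- rowSums((X_mat %*% V) * X_mat)

# Explicit (but wasteful) version: diagonal of the full n x n matrix
var_fit_full <- diag(X_mat %*% V %*% t(X_mat))

# With the classical VCOV this should reproduce predict()'s se.fit
se_fit_manual  <- sqrt(var_fit_fast)
se_fit_predict <- predict(reg1, se.fit = TRUE)$se.fit

all.equal(unname(var_fit_fast), unname(var_fit_full))
all.equal(unname(se_fit_manual), unname(se_fit_predict))
```

Replacing vcov(reg1) with, say, sandwich::vcovHC(reg1, type = "HC3") leaves the rest of the calculation unchanged.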




Answer 2:


I am very new to this whole robust SE thing, but I was able to generate the following:

zz = '
x y
1     1    12
2     2    24
3     3    24
4     4    34
5     5    12
6     6    15 
'

df <- read.table(text = zz, header = TRUE)
df

library(sandwich)
library(lmtest)

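lm.model <- lm(y ~ x, data = df)
coef(lm.model)

## SE of each fitted value, x_i' V x_i, using the HC3 covariance matrix.
## (sqrt(diag(vcovHC(...))) alone gives the SEs of the two coefficients,
## not of the predictions.)
X <- model.matrix(lm.model)
V <- vcovHC(lm.model, type = "HC3")
se <- sqrt(rowSums((X %*% V) * X))
fit <- predict(lm.model)
predframe <- with(df, data.frame(x,
                                 y = fit,
                                 lwr = fit - 1.96 * se,
                                 upr = fit + 1.96 * se))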

library(ggplot2)
ggplot(df, aes(x = x, y = y))+
  geom_point()+
  geom_line(data = predframe)+
  geom_ribbon(data = predframe, aes(ymin = lwr,ymax = upr), alpha = 0.3)
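
The ribbon above uses the normal critical value 1.96. With only six observations, a t critical value on the residual degrees of freedom (as in the first answer) gives a noticeably wider band. A small base-R sketch of the difference, where z_crit and t_crit are illustrative names:

```r
# Self-contained copy of the example data
df <- data.frame(x = 1:6, y = c(12, 24, 24, 34, 12, 15))
lm.model <- lm(y ~ x, data = df)

# Normal approximation vs. t distribution with n - p residual df
z_crit <- qnorm(0.975)
t_crit <- qt(0.975, df = df.residual(lm.model))

c(normal = z_crit, t = t_crit)
```

To use it, replace 1.96 with t_crit when building lwr and upr.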



Source: https://stackoverflow.com/questions/45313482/ggplot2-how-to-get-robust-confidence-interval-for-predictions-in-geom-smooth
