Formatted latex regression tables with multiple models from broom output?

冷暖自知 提交于 2021-02-16 07:01:33

问题


I have several models such as the example below for which I have estimates, standard errors, p-values, r2 etc. as data.frames in tidy format, but I don't have the original model objects (analysis was run on a different machine).

require(broom)
model <- lm(mpg ~ hp + cyl, mtcars)
tidy_model <- tidy(model)
glance_model <- glance(model)

# tidy_model
# # A tibble: 3 x 5
#   term        estimate std.error statistic  p.value
#   <chr>          <dbl>     <dbl>     <dbl>    <dbl>
# 1 (Intercept)  36.9       2.19       16.8  1.62e-16
# 2 hp           -0.0191    0.0150     -1.27 2.13e- 1
# 3 cyl          -2.26      0.576      -3.93 4.80e- 4
# glance(model)
# # A tibble: 1 x 11
#   r.squared adj.r.squared sigma ...
# *     <dbl>         <dbl> <dbl>     ...
# 1     0.760         0.743  3.06      ...

There exist several packages (e.g. stargazer or texreg) which transform one or more model objects (lm, glm, etc.) into well-formatted regression tables side-by-side, see below for an example of texreg:

require(texreg)
screenreg(list(model1, model1)
# =================================
#              Model 1    Model 2  
# ---------------------------------
# (Intercept)  34.66 ***  34.66 ***
#              (2.55)     (2.55)   
# cyl          -1.59 *    -1.59 *  
#              (0.71)     (0.71)   
# disp         -0.02      -0.02    
#              (0.01)     (0.01)   
# ---------------------------------
# R^2           0.76       0.76    
# Adj. R^2      0.74       0.74    
# Num. obs.    32         32       
# RMSE          3.06       3.06    
# =================================
# *** p < 0.001, ** p < 0.01, * p < 0.05

Is there a similar package that uses tidy estimation results produced with broom as inputs rather than model objects to produce a table such as the above example?


回答1:


Is there a similar package that uses tidy estimation results produced with broom as inputs

Not to my knowledge, but stargazer allows you to use custom inputs to generate regression tables. This allows us to create "fake" shell tables that we can populate with values from the tidy table. Using your example

# create fake models
dat <- lapply(tidy_model$term, function(...) rnorm(10))
dat <- as.data.frame(setNames(dat, c("mpg", tidy_model$term[-1])))
f <- as.formula(paste("mpg ~", paste(tidy_model$term[-1], collapse = " + ")))
fit <- lm(f, dat)

# set up model statistics
fit_stats <- data.frame(labels = names(glance_model),
                        mod1 = round(unlist(glance_model), 3),
                        mod2 = round(unlist(glance_model), 3),
                        row.names = NULL,
                        stringsAsFactors = FALSE)

We can then feed these values into stargazer:

library(stargazer)

stargazer(fit, fit, type = "text", 
  coef = list(tidy_model$estimate, tidy_model$estimate),
  se = list(tidy_model$std.error, tidy_model$std.error),
  add.lines = lapply(1:nrow(fit_stats), function(i) unlist(fit_stats[i, ])),
  omit.table.layout = "s"
)
# ==========================================
#                   Dependent variable:     
#               ----------------------------
#                           mpg             
#                    (1)            (2)     
# ------------------------------------------
# hp                -0.019        -0.019    
#                  (0.015)        (0.015)   

# cyl             -2.265***      -2.265***  
#                  (0.576)        (0.576)   

# Constant        36.908***      36.908***  
#                  (2.191)        (2.191)   

# ------------------------------------------
# r.squared         0.741          0.741    
# adj.r.squared     0.723          0.723    
# sigma             3.173          3.173    
# statistic         41.422        41.422    
# p.value             0              0      
# df                  3              3      
# logLik           -80.781        -80.781   
# AIC              169.562        169.562   
# BIC              175.425        175.425   
# deviance         291.975        291.975   
# df.residual         29            29      
# ==========================================
# Note:          *p<0.1; **p<0.05; ***p<0.01



回答2:


I had another look at texreg, inspired by this answer, and there is a more native way to do this by defining an additional extraction method for texreg in addition to the previous answer:

extract_broom <- function(tidy_model, glance_model) {
  # get estimates/standard errors from tidy
  coef <- tidy_model$estimate
  coef.names <- as.character(tidy_model$term)
  se <- tidy_model$std.error
  pvalues <- tidy_model$p.value
  # get goodness-of-fit statistics from glance
  glance_transposed <- as_tibble(cbind(name = names(glance_model), t(glance_model)))
  gof.names <- as.character(glance_transposed$name)
  gof <- as.double(glance_transposed$value)
  gof.decimal <- gof %% 1 > 0
  tr_object <- texreg::createTexreg(coef.names = coef.names,
                                    coef = coef,
                                    se = se,
                                    pvalues = pvalues,
                                    gof.names = gof.names,
                                    gof = gof,
                                    gof.decimal = gof.decimal)
  return(tr_object)
}

This results in the following output:

texreg_model <- extract_broom(tidy_model, glance_model)
screenreg(list(texreg_model, texreg_model))

# =====================================
#                Model 1     Model 2   
# -------------------------------------
# (Intercept)     36.91 ***   36.91 ***
#                 (2.19)      (2.19)   
# hp              -0.02       -0.02    
#                 (0.02)      (0.02)   
# cyl             -2.26 ***   -2.26 ***
#                 (0.58)      (0.58)   
# -------------------------------------
# r.squared        0.74        0.74    
# adj.r.squared    0.72        0.72    
# sigma            3.17        3.17    
# statistic       41.42       41.42    
# p.value          0.00        0.00    
# df               3           3       
# logLik         -80.78      -80.78    
# AIC            169.56      169.56    
# BIC            175.42      175.42    
# deviance       291.97      291.97    
# df.residual     29          29       
# =====================================
# *** p < 0.001, ** p < 0.01, * p < 0.05


来源:https://stackoverflow.com/questions/52027811/formatted-latex-regression-tables-with-multiple-models-from-broom-output

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!