Odds ratios instead of logits in stargazer() LaTeX output

感情败类 2020-12-13 21:00

When using stargazer to create a LaTeX table from a logistic regression object, the default behaviour is to output the logit coefficients of each model. Is it possible to get exp(logit), i.e. odds ratios, instead?

4 answers
  •  离开以前
    2020-12-13 21:23

    There are pieces of the right answer across the various posts, but none of them seem to put it all together. Assuming the following:

    library(stargazer)
    glm_out <- glm(Y ~ X, data = DT, family = "binomial")

    Getting the Odds-Ratio

    For a logistic regression, the regression coefficient (b1) is the estimated increase in the log odds of Y per unit increase in X. So, to get the odds-ratio, we just use the exp function:

    OR <- exp(coef(glm_out))
    
    # pass in coef directly
    stargazer(glm_out, coef = list(OR), t.auto = FALSE, p.auto = FALSE)
    
    # or, use the apply.coef option
    stargazer(glm_out, apply.coef = exp, t.auto = FALSE, p.auto = FALSE)
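To see why exp(b1) is an odds ratio, here is a minimal base-R sketch. The simulated data frame is an assumption made only for illustration (the original post does not show `DT`); it compares exp(b1) with the ratio of fitted odds at X = 1 and X = 0.

```r
# Simulated data: an assumption for illustration, not from the original post
set.seed(1)
DT <- data.frame(X = rnorm(200))
DT$Y <- rbinom(200, size = 1, prob = plogis(-0.5 + 0.8 * DT$X))

glm_out <- glm(Y ~ X, data = DT, family = "binomial")
OR <- exp(coef(glm_out))

# Fitted probabilities of Y = 1 at X = 0 and X = 1
p <- predict(glm_out, newdata = data.frame(X = c(0, 1)), type = "response")
odds <- p / (1 - p)

# Ratio of the odds for a one-unit increase in X
odds_ratio_by_hand <- unname(odds[2] / odds[1])

# Both values should agree up to floating-point error
print(c(exp_b1 = unname(OR["X"]), by_hand = odds_ratio_by_hand))
```

Because the model is linear on the log-odds scale, the same ratio is obtained for any pair of X values one unit apart.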
    

    Getting the Standard Error of the Odds-Ratio

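    You cannot simply use apply.se = exp to get the Std. Error for the Odds Ratio, because exponentiating a standard error does not give the standard error of the exponentiated coefficient.

    Instead, use the delta-method approximation: Std.Error.OR = OR * SE(coef)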

    # define a helper function to extract SE from glm output
    se.coef <- function(glm.output){sqrt(diag(vcov(glm.output)))}
    
    # or, you can use the arm package
    se.coef <- arm::se.coef
    
    # Get the odds ratio
    OR <- exp(coef(glm_out))
    
    # Then, we can get `Std.Error.OR` by multiplying the two:
    Std.Error.OR <- OR * se.coef(glm_out)
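The formula OR * SE(coef) is the first-order delta-method approximation for g(b) = exp(b). As a rough check, the sketch below (base R only, on simulated data that is an assumption for illustration) computes the delta-method variance through the Jacobian and compares it with the shortcut.

```r
# Simulated data: an assumption for illustration, not from the original post
set.seed(1)
DT <- data.frame(X = rnorm(200))
DT$Y <- rbinom(200, size = 1, prob = plogis(-0.5 + 0.8 * DT$X))
glm_out <- glm(Y ~ X, data = DT, family = "binomial")

b  <- coef(glm_out)
V  <- vcov(glm_out)
OR <- exp(b)

# Delta method: Var(g(b)) is approximately J V J', where J is the Jacobian of g.
# For g(b) = exp(b) applied elementwise, J is diagonal with entries exp(b).
J    <- diag(OR)
V_OR <- J %*% V %*% t(J)
se_delta <- sqrt(diag(V_OR))

# Shortcut from the text: OR * SE(coef)
Std.Error.OR <- OR * sqrt(diag(V))

print(cbind(delta = unname(se_delta), shortcut = unname(Std.Error.OR)))
```

The two columns should match, since the diagonal of J V J' is exp(b)^2 times the diagonal of V.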
    

    So, to get it into stargazer, we use the following:

    # using Std Errors
    stargazer(glm_out, coef = list(OR), se = list(Std.Error.OR), t.auto = FALSE, p.auto = FALSE)
    

    Computing CIs for the Odds-Ratio

    Confidence intervals for an odds ratio are not symmetric around the estimate, so we cannot just compute OR ± 1.96*SE(OR). Instead, we build the interval on the log-odds scale and exponentiate its endpoints: exp(coef ± 1.96*SE).

    # Based on normal distribution to compute Wald CIs:
    # we use confint.default to obtain the conventional confidence intervals
    # then, use the exp function to get the confidence intervals
    
    CI.OR <- as.matrix(exp(confint.default(glm_out)))
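As an illustration of what `confint.default` does here, the following base-R sketch rebuilds the Wald interval by hand and shows its asymmetry around the odds ratio. The simulated data and the names `CI_by_hand`, `dist_below` and `dist_above` are assumptions introduced for this example.

```r
# Simulated data: an assumption for illustration, not from the original post
set.seed(1)
DT <- data.frame(X = rnorm(200))
DT$Y <- rbinom(200, size = 1, prob = plogis(-0.5 + 0.8 * DT$X))
glm_out <- glm(Y ~ X, data = DT, family = "binomial")

b  <- coef(glm_out)
se <- sqrt(diag(vcov(glm_out)))
OR <- exp(b)

# Wald interval on the log-odds scale, then exponentiated
z_crit <- qnorm(0.975)
CI_by_hand <- exp(cbind(lower = b - z_crit * se, upper = b + z_crit * se))

# Same interval via confint.default
CI.OR <- exp(confint.default(glm_out))

# Distance from the OR to each endpoint: the upper side is wider
dist_below <- OR - CI_by_hand[, "lower"]
dist_above <- CI_by_hand[, "upper"] - OR
print(cbind(OR, CI_by_hand, dist_below, dist_above))
```

Note that `confint()` on a glm object may return profile-likelihood intervals rather than Wald intervals, which is why `confint.default` is used for the comparison.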
    

    So, to get it into stargazer, we use the following:

    # using ci.custom
    stargazer(glm_out, coef = list(OR), ci.custom = list(CI.OR), t.auto = FALSE, p.auto = FALSE, ci = TRUE)
    
    # using apply.ci
    stargazer(glm_out, apply.coef = exp, apply.ci = exp, t.auto = FALSE, p.auto = FALSE, ci = TRUE)
    

    NOTE ABOUT USING CONFIDENCE INTERVALS FOR SIGNIFICANCE TESTS:

    Do not use the Confidence Intervals of Odds Ratios to compute significance (see note and reference at the bottom). Instead, you can do it using the log odds:

    z <- coef(glm_out)/se.coef(glm_out)
    

    And, use that to get the p.values for significance tests:

    pvalue <- 2 * pnorm(abs(z), lower.tail = FALSE)
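A quick way to convince yourself that these are the usual Wald p-values is to compare them with what `summary()` reports. The sketch below uses base R only; the simulated data and the name `reported` are assumptions for illustration.

```r
# Simulated data: an assumption for illustration, not from the original post
set.seed(1)
DT <- data.frame(X = rnorm(200))
DT$Y <- rbinom(200, size = 1, prob = plogis(-0.5 + 0.8 * DT$X))
glm_out <- glm(Y ~ X, data = DT, family = "binomial")

# z statistic on the log-odds scale
se <- sqrt(diag(vcov(glm_out)))
z  <- coef(glm_out) / se

# Two-sided p-value from the standard normal distribution
pvalue <- 2 * pnorm(abs(z), lower.tail = FALSE)

# p-values reported by summary() for the same model
reported <- summary(glm_out)$coefficients[, "Pr(>|z|)"]

print(cbind(by_hand = pvalue, summary = reported))
```

These p-values belong to the log-odds coefficients, so they stay valid when the table displays odds ratios, which is the reason for passing `t.auto = FALSE` and `p.auto = FALSE` to stargazer.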
    

    (source: https://data.princeton.edu/wws509/r/c3s1)

    See this link for more detailed discussion on statistical testing: https://stats.stackexchange.com/questions/144603/why-do-my-p-values-differ-between-logistic-regression-output-chi-squared-test

    It is important to note however, that unlike the p value, the 95% CI does not report a measure’s statistical significance. In practice, the 95% CI is often used as a proxy for the presence of statistical significance if it does not overlap the null value (e.g. OR=1). Nevertheless, it would be inappropriate to interpret an OR with 95% CI that spans the null value as indicating evidence for lack of association between the exposure and outcome. source: Explaining Odds Ratios
