glmnet: How do I know which factor level of my response is coded as 1 in logistic regression

Submitted by こ雲淡風輕ζ on 2019-12-01 18:10:41

Have a look at ?glmnet (page 9 of https://cran.r-project.org/web/packages/glmnet/glmnet.pdf):

y

response variable. ... For family="binomial" should be either a factor
with two levels, or a two-column matrix of counts or proportions (the 
second column is treated as the target class; for a factor, the last
level in alphabetical order is the target class) ...

So if your factor levels are "a" and "b", "a" is coded as 0 and "b" is coded as 1.

This treatment is standard. It follows from the order of the factor levels, which R either assigns automatically (alphabetically) or takes from the levels you specify yourself. For example:

## automatic coding by R based on alphabetical order
set.seed(0); y1 <- factor(sample(letters[1:2], 10, replace = TRUE))
## manual coding
set.seed(0); y2 <- factor(sample(letters[1:2], 10, replace = TRUE),
                   levels = c("b", "a"))

# > y1
# [1] b a a b b a b b b b
# Levels: a b
# > y2
# [1] b a a b b a b b b b
# Levels: b a

# > levels(y1)
# [1] "a" "b"
# > levels(y2)
# [1] "b" "a"
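
To see the 0/1 coding that the level order implies, the factor can be converted to its integer codes. This is a minimal base R sketch; the vector `y` and the helper `target_class()` are made up for illustration and are not part of glmnet.

```r
# Hypothetical two-level response with fixed values (no random sampling)
y <- factor(c("b", "a", "a", "b"))

# Integer codes follow the level order: first level -> 1, second level -> 2,
# so subtracting 1 gives the 0/1 coding of a two-level response
coded <- as.integer(y) - 1L
data.frame(y = y, coded = coded)

# Illustrative helper: return the last level, i.e. the one treated as the target class
target_class <- function(f) levels(f)[nlevels(f)]
target_class(y)                                 # "b"
target_class(factor(y, levels = c("b", "a")))   # "a"

# relevel() moves the chosen level to the front, so the other level becomes the target
y_flipped <- relevel(y, ref = "b")
levels(y_flipped)                               # "b" "a"
```

If the level you care about is not the last one, reordering the levels before fitting is usually simpler than reinterpreting the coefficients afterwards.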

Whether you use glmnet() or simply glm(), the same thing happens: for a two-level factor, the first level is coded as 0 and the second level as 1.
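
One way to check this empirically is to fit the same model twice with the levels in opposite order and compare the coefficients. The sketch below uses glm() only, so that it runs without extra packages; the data `x` and `y` are invented for illustration.

```r
# Hypothetical data: "b" tends to occur at larger x, and the classes overlap
x <- c(1, 2, 3, 4, 5, 6, 7, 8)
y <- factor(c("a", "a", "b", "a", "b", "b", "a", "b"))   # levels: "a" "b"

# Default level order: the model describes P(y == "b")
fit_ab <- glm(y ~ x, family = binomial)

# Reversed level order: the model now describes P(y == "a")
y_rev  <- relevel(y, ref = "b")                          # levels: "b" "a"
fit_ba <- glm(y_rev ~ x, family = binomial)

# The slope is positive when "b" is the target and flips sign when "a" is
coef(fit_ab)
coef(fit_ba)

# Fitted probabilities from the two fits are complementary
range(fitted(fit_ab) + fitted(fit_ba))
```

A positive slope in the first fit and the mirrored negative slope in the second indicate that the last factor level is the class being modeled as 1.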
