Difference in GLM results between IPython and R

A little bit of code cleanup:

set.seed(101)
## simulate two continuous predictors and a five-level factor
dat <- data.frame(a=rnorm(500), b=runif(500),
                  c=as.factor(sample(1:5, 500, replace=TRUE)))
library(plyr)
## build a noisy continuous response, then dichotomize it at its mean
dat <- mutate(dat,
              y0=((jitter(a)^2+(-log10(b)))/(as.numeric(c)/10))+rnorm(500),
              y=(y0>=mean(y0)))

## fit the binomial GLM; fit2 caps the IWLS iterations at 6 to show that the
## default fit converges quickly rather than hitting an iteration limit
fit1 <- glm(y~a+b+c, data=dat, family=binomial('logit'))
fit2 <- update(fit1, control=glm.control(maxit=6))
all.equal(fit1, fit2)  ## the fits agree apart from the recorded call/control

coef(fit1)
## (Intercept)           a           b          c2          c3          c4 
##  1.22283193 -0.07544488 -1.54732712 -0.36477556 -1.46313143 -1.95008291 
##          c5 
## -3.11914945
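
The convergence check behind fit2 can also be made directly: a fitted glm object records its IWLS iteration count and a convergence flag (a minimal sketch using components that glm stores on the returned object):

## sanity-check convergence of the default fit
fit1$iter       ## number of IWLS iterations used (well under the default maxit = 25)
fit1$converged  ## TRUE if the IWLS loop converged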

I agree with @Roland's comment that a reproducible example would help. The most likely source of the difference is the contrast coding used for the factor, e.g.:

## refit with sum-to-zero contrasts for the factor
fit3 <- update(fit1, contrasts=list(c=contr.sum))
coef(fit3)
## (Intercept)           a           b          c1          c2          c3 
## -0.15659594 -0.07544488 -1.54732712  1.37942787  1.01465231 -0.08370356 
##          c4 
## -0.57065503
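
To see exactly which coding each fit uses, compare the design matrices directly (a minimal sketch using base R's model.matrix; treatment contrasts give 0/1 dummy columns against the baseline level, sum contrasts give sum-to-zero deviation columns):

## treatment (dummy) coding: columns c2..c5
head(model.matrix(fit1))
## sum-to-zero coding: columns c1..c4
head(model.matrix(fit3))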

If you use a model with only continuous predictors, do the results match better?
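
For instance (a quick sketch, reusing the fits above; update() with . ~ . - c just drops the factor from the formula):

## refit without the factor to take contrast coding out of the picture
fit4 <- update(fit1, . ~ . - c)
coef(fit4)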

Update: contrast coding can't be the whole story, because the deviance/log-likelihood differs as well as the coefficients; a change of contrasts only reparameterizes the model, so it changes the coefficients but leaves the deviance and log-likelihood untouched.
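
That reasoning can be checked on the fits above (a minimal sketch; fit1 and fit3 differ only in contrasts, so their fit statistics should match):

## identical deviance and log-likelihood despite different coefficients
all.equal(deviance(fit1), deviance(fit3))  ## should be TRUE: same fit, different parameterization
logLik(fit1); logLik(fit3)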
