Question
All,
I ran a logistic regression on a set of variables, both categorical and continuous, with a binary event as the dependent variable.
After modelling, I see a set of categorical variables with negative coefficients. I take that to mean that when the categorical variable is present, the probability of the dependent event occurring is lower.
But when I look at the % of occurrence of the event by that independent variable, I see the reverse trend, so the result seems counterintuitive. Is there any reason why this could happen? I have tried to explain it below with a pseudo example.
Dependent variable: E
Predictors:
1. Categorical variable Cat1 with 2 levels (0, 1)
2. Continuous variable Con1
3. Categorical variable Cat2 with 2 levels (0, 1)

Post modelling: say all are significant and the coefficients are as below:
Cat1: -0.6
Con1: 0.3
Cat2: -0.4
But when I calculate the % of occurrence of event E by Cat1, I observe that the % of occurrence is higher when Cat1 is 1, which I think is counterintuitive.
Please help me understand this.
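Answer 1:
Coefficients of logistic regression are not directly related to the change in the probability of the event; rather, each one measures the change in the log-odds of the event, so exp(coefficient) is an odds ratio. This article has a detailed derivation of how to interpret the coefficients of logistic regression. In your context, a coefficient of -0.6 for Cat1 means that, with Con1 and Cat2 held fixed, p(E|Cat1 = 1) < p(E|Cat1 = 0); it says nothing about exactly how big p(E|Cat1 = 1) is. The raw % of occurrence of E by Cat1 does not hold the other predictors fixed, so it can show the opposite trend if, for example, rows with Cat1 = 1 also tend to have higher Con1.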
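To make the point above concrete, here is a minimal Python sketch using the coefficients from the question. The intercept is not given in the question, so `INTERCEPT = 0.0` is an assumption, and the two small groups of rows are hypothetical values chosen only to illustrate the effect; they are not the asker's data.

```python
import math

# Coefficients from the question's pseudo example.
# The intercept is NOT given in the question; 0.0 is an assumed value.
INTERCEPT = 0.0
B_CAT1, B_CON1, B_CAT2 = -0.6, 0.3, -0.4


def prob(cat1, con1, cat2):
    """Model probability of event E for one row (inverse logit of the linear predictor)."""
    z = INTERCEPT + B_CAT1 * cat1 + B_CON1 * con1 + B_CAT2 * cat2
    return 1.0 / (1.0 + math.exp(-z))


def mean_prob(rows):
    """Average model probability over a group of rows."""
    return sum(prob(*row) for row in rows) / len(rows)


# exp(coefficient) is the odds ratio for Cat1 = 1 vs Cat1 = 0,
# with Con1 and Cat2 held fixed.
odds_ratio_cat1 = math.exp(B_CAT1)

# Same Con1 and Cat2, only Cat1 differs: Cat1 = 1 gives the lower probability.
p_cat1_0_same_row = prob(0, 2.0, 0)
p_cat1_1_same_row = prob(1, 2.0, 0)

# Hypothetical rows (cat1, con1, cat2): the Cat1 = 1 group happens to have
# much higher Con1 values than the Cat1 = 0 group.
rows_cat1_0 = [(0, 0.0, 0), (0, 1.0, 0)]
rows_cat1_1 = [(1, 5.0, 0), (1, 6.0, 0)]

# Raw (marginal) event rates by Cat1 do not hold Con1 fixed.
rate_cat1_0 = mean_prob(rows_cat1_0)
rate_cat1_1 = mean_prob(rows_cat1_1)

print(f"odds ratio for Cat1: {odds_ratio_cat1:.3f}")
print(f"same row, Cat1=0 vs Cat1=1: {p_cat1_0_same_row:.3f} vs {p_cat1_1_same_row:.3f}")
print(f"group rate, Cat1=0 vs Cat1=1: {rate_cat1_0:.3f} vs {rate_cat1_1:.3f}")
```

In this sketch the Cat1 = 1 group has the higher raw rate even though the Cat1 coefficient is negative, because the positive Con1 effect in that group outweighs it. Checking how Con1 and Cat2 are distributed within each level of Cat1 in the real data would show whether something similar is going on.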
Source: https://stackoverflow.com/questions/35291428/interpreting-coefficients-from-logistic-regression-from-r