Question
I am running a logit regression on some data. My dependent variable is binary, as are all but one of my independent variables.
When I run my regression, Stata drops many of my independent variables and gives the error:
"variable name" != 0 predicts failure perfectly
"variable name" dropped and "a number" obs not used
I know for a fact that some of the dropped variables don't predict failure perfectly. In other words, the dependent variable can take on the value 1 whether the independent variable is 1 or 0.
Why is this happening and how can I resolve it?
Answer 1:
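A bivariate cross-tabulation will not reveal the problem, because the outcome can be determined by a combination of covariates rather than by any single one. Try the procedure from this Stata FAQ: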
http://www.stata.com/support/faqs/statistics/completely-determined-in-logistic-regression/index.html
First confirm that this is what is happening [collinear]. (For your data, replace x1 and x2 with the independent variables of your model.)
Number covariate patterns:
egen pattern = group(x1 x2)
Identify pattern with only one outcome:
logit y x1 x2
predict p
summarize p
* the extremes of p will be almost 0 or almost 1
tab pattern if p < 1e-7 // (use a value here slightly bigger than the min)
* or in the above use "if p > 1 - 1e-7" if p is almost 1
list x1 x2 if pattern == XXXX // (use the value here from the tab step)
* the above identifies the covariate pattern
The covariate pattern that predicts outcome perfectly may be meaningful to the researcher or may be an anomaly due to having many variables in the model.
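As a cross-check on the predicted-probability approach, one could also look for covariate patterns in which the outcome never varies. The following is only a sketch, not part of the FAQ: it assumes `pattern` was already created with `egen ... group()` as above, that `y` is coded 0/1, and the names `ymin`, `ymax`, `n_obs` and `first` are hypothetical helper variables introduced here for illustration.

```stata
* Sketch only: y, x1, x2 and pattern are the placeholder names used above;
* ymin, ymax, n_obs and first are hypothetical helper variables.

* Smallest and largest outcome observed within each covariate pattern
bysort pattern: egen ymin = min(y)
bysort pattern: egen ymax = max(y)

* Number of observations in each pattern
bysort pattern: gen n_obs = _N

* Patterns in which y takes only one value are candidates for perfect prediction
tabulate pattern if ymin == ymax & !missing(ymin)

* One row per candidate pattern: covariate values, the constant outcome, and its size
bysort pattern: gen byte first = _n == 1
list pattern x1 x2 ymin n_obs if ymin == ymax & !missing(ymin) & first
```

Patterns containing a single observation will always show up in this list, which is why `n_obs` is displayed: a pattern with many observations and only one outcome is more likely to be meaningful than a singleton. The list gives candidates only; whether logit actually treats a pattern as completely determined still depends on the fitted model.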
Now you must get rid of the collinearity:
logit y x1 x2 if pattern ~= XXXX // (use the value here from the tab step)
* note that there is collinearity
* You can omit the variable that logit drops or drop another one.
Refit the model with the collinearity removed:
logit y x1
You may or may not want to include the covariate pattern that predicts outcome perfectly. It depends on whether that pattern is meaningful or an anomaly, as discussed above. If the covariate pattern that predicts outcome perfectly is meaningful, you may want to exclude these observations from the model:
logit y x1 if pattern ~= XXXX
Here one would report:
- Covariate pattern such and such predicted outcome perfectly.
- The best model for the rest of the data is ....xyz
Source: https://stackoverflow.com/questions/44371631/stata-drops-variables-that-predicts-failure-perfeclty-even-though-the-correlat