“Perfect separation” error when using Matcher from pymatch (Propensity score matching)

六月ゝ 毕业季﹏ submitted on 2020-03-03 07:02:07

Question


I am trying to use the pymatch package, but I keep getting the error Error: Perfect separation detected, results not available. I have checked several times that the two groups are not identical: the data contain 260k control rows and 50k treatment rows, and the group means differ. There are only 5 variables, all integers or floats rounded to 2 decimal places.

My goal is to match treated customers to non-treated customers via propensity score matching, for further analysis.

I have already removed outliers, since the package apparently does not handle them well, and rounded the decimals to 2 places. I also tried using only 2 of the variables. Nothing worked.

import numpy as np
import pandas as pd
from pymatch.Matcher import Matcher

d = {'Customer': ['A', 'B', 'C', 'D'], 'Basket_Size': [30, 40, 25, 30], 'Miles_away': [5.2, 15.4, 16.3, 7.2], 'was_treated': [1, 0, 0, 1]}
df = pd.DataFrame(data=d)
df

test = df[df.was_treated == 1]
control = df[df.was_treated == 0]

m = Matcher(test, control, yvar="was_treated", exclude=['Customer'])
## until here it runs perfectly fine

# output:

#Formula:
#was_treated~ Basket_Size+Miles_away
#n majority: 2
#n minority: 2


## this now throws the error
np.random.seed(20170925)
m.fit_scores(balance=True, nmodels=20)

# output: 
# Error: Perfect separation detected, results not available
# Fitting Models on Balanced Samples: 1\20

I expect output such as Average Accuracy: 78%, but instead I get Average Accuracy: nan% together with Error: Perfect separation detected, results not available.


Answer 1:


I solved the issue myself. By the nature of the data, one variable was tied directly to the treatment: no data point with was_treated == 1 could have Miles_away > 10 and, conversely, no data point with was_treated == 0 could have Miles_away < 10. That variable alone therefore predicted the treatment perfectly, which is the perfect separation the error refers to. Excluding it from the propensity scoring solved the issue.
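To illustrate why such a variable stops the fit, here is a minimal sketch in plain Python. It is not pymatch's internals. It evaluates the log-likelihood of a one-variable logistic model on the four sample rows from the question. The cut-off of 10 comes from the description above; the function names and the list of slopes are made up for the illustration. The log-likelihood keeps rising as the slope grows, so no finite maximum-likelihood estimate exists.

```python
import math

# The four sample rows from the question.
miles_away = [5.2, 15.4, 16.3, 7.2]
was_treated = [1, 0, 0, 1]


def log_sigmoid(z):
    """Numerically stable log(1 / (1 + exp(-z)))."""
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))


def log_likelihood(slope, threshold=10.0):
    """Log-likelihood of P(treated) = sigmoid(slope * (threshold - miles))."""
    total = 0.0
    for x, y in zip(miles_away, was_treated):
        z = slope * (threshold - x)
        # log P(y=1) = log_sigmoid(z); log P(y=0) = log_sigmoid(-z)
        total += log_sigmoid(z) if y == 1 else log_sigmoid(-z)
    return total


# Hypothetical grid of slopes: a steeper slope always fits better.
slopes = [0.1, 1.0, 10.0, 100.0]
log_likelihoods = [log_likelihood(s) for s in slopes]
for s, ll in zip(slopes, log_likelihoods):
    print(f"slope={s:>6}: log-likelihood={ll:.6g}")
```

Because the values approach 0 (a perfect fit) without ever reaching it, the optimiser would push the coefficient towards infinity, which is the situation the error message reports.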


    m = Matcher(test, control, yvar="was_treated", exclude=['Customer', 'Miles_away'])
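One way to spot such a variable before fitting is to check, for each numeric covariate, whether its value ranges in the two groups fail to overlap. The helper below is a hypothetical sketch: separating_columns is not part of pymatch, and it assumes the treatment indicator is coded 0/1. It only catches separation by a single variable, not by a combination of variables.

```python
import pandas as pd


def separating_columns(df, yvar, exclude=()):
    """Return numeric covariates whose ranges do not overlap between groups.

    Hypothetical helper (not part of pymatch); assumes df[yvar] is coded 0/1.
    """
    treated = df[df[yvar] == 1]
    control = df[df[yvar] == 0]
    flagged = []
    for col in df.columns:
        if col == yvar or col in exclude:
            continue
        if not pd.api.types.is_numeric_dtype(df[col]):
            continue
        # Ranges are disjoint if one group lies entirely below the other.
        if (treated[col].max() < control[col].min()
                or control[col].max() < treated[col].min()):
            flagged.append(col)
    return flagged


# Sample data from the question.
df = pd.DataFrame({
    'Customer': ['A', 'B', 'C', 'D'],
    'Basket_Size': [30, 40, 25, 30],
    'Miles_away': [5.2, 15.4, 16.3, 7.2],
    'was_treated': [1, 0, 0, 1],
})

flagged = separating_columns(df, 'was_treated', exclude=['Customer'])
print(flagged)
```

Any column this returns is a candidate for the exclude list passed to Matcher, as done above for Miles_away.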


Source: https://stackoverflow.com/questions/56786211/perfect-separation-error-when-using-matcher-from-pymatch-propensity-score-mat
