How to enforce monotonic constraints in XGBoost with scikit-learn?

Submitted by 时光总嘲笑我的痴心妄想 on 2019-12-22 10:55:55

Question


I built an XGBoost model using scikit-learn and I am pretty happy with it. As fine-tuning to avoid overfitting, I'd like to enforce monotonicity for some features, but that is where I start facing difficulties...

As far as I understand, scikit-learn has no documentation about xgboost (which, I confess, really surprises me, given that this situation has lasted for several months). The only documentation I found is directly on http://xgboost.readthedocs.io

On this website, I found that monotonicity can be enforced using the "monotone_constraints" option. I tried to use it with the scikit-learn wrapper, but I got the error message "TypeError: __init__() got an unexpected keyword argument 'monotone_constraints'".

Do you know a way to do this?

Here is the code I wrote in Python (using Spyder):

grid = {'learning_rate': 0.01, 'subsample': 0.5, 'colsample_bytree': 0.5,
        'max_depth': 6, 'min_child_weight': 10, 'gamma': 1,
        'monotone_constraints': monotonic_indexes}
# monotonic_indexes = "(1,-1)"
m07_xgm06 = xgb.XGBClassifier(n_estimators=2000, **grid)
m07_xgm06.fit(X_train_v01_oe, Label_train, early_stopping_rounds=10, eval_metric="logloss",
              eval_set=[(X_test1_v01_oe, Label_test1)])

Answer 1:


To do this through the xgboost scikit-learn API, you need to upgrade to xgboost 0.81. The ability to set parameters passed via kwargs was fixed as part of this PR: https://github.com/dmlc/xgboost/pull/3791




Answer 2:


The XGBoost scikit-learn API currently (0.6a2) doesn't support monotone_constraints. You can use the native Python API instead. Take a look at the example.

This code in the example can be removed:

params_constr['updater'] = "grow_monotone_colmaker,prune"



Answer 3:


How would you expect monotone constraints to work for a general classification problem, where the response might have more than 2 levels? All the examples I've seen relating to this functionality are for regression problems. If your classification response has only 2 levels, try switching to regression on an indicator variable and then choosing an appropriate score threshold for classification.

This feature appears to work as of the latest xgboost / scikit-learn, provided that you use an XGBRegressor rather than an XGBClassifier and set monotone_constraints via kwargs.

The syntax is like this:

params = {
    'monotone_constraints': '(-1,0,1)'
}

normalised_weighted_poisson_model = XGBRegressor(**params)

In this example, there is a negative constraint on the first column of the training data, no constraint on the second, and a positive constraint on the third. It is up to you to keep track of which is which: you cannot refer to columns by name, only by position, and you must specify an entry in the constraint tuple for every column in your training data.
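Because the constraints are purely positional, it can help to build the tuple string from your column order instead of writing it by hand. A small hypothetical helper (the function and column names below are made up, not part of xgboost):

```python
def constraint_string(columns, directions):
    """Build an XGBoost monotone_constraints string from column names.

    `directions` maps column name -> 1 (increasing) or -1 (decreasing);
    any column not listed gets 0 (unconstrained).
    """
    return "(" + ",".join(str(directions.get(c, 0)) for c in columns) + ")"

cols = ["price", "quantity", "discount"]
constraint_string(cols, {"price": -1, "discount": 1})  # -> "(-1,0,1)"
```

If the column order of the training data ever changes, the string is rebuilt correctly as long as `cols` matches the DataFrame's column order.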



Source: https://stackoverflow.com/questions/43076451/how-to-enforce-monotonic-constraints-in-xgboost-with-scikitlearn
