Grid Search and Early Stopping Using Cross Validation with XGBoost in SciKit-Learn

微笑、不失礼 提交于 2019-12-03 06:55:17

You could pass you early_stopping_rounds, and eval_set as an extra fit_params to GridSearchCV, and that would actually work. However, GridSearchCV will not change the fit_params between the different folds, so you would end up using the same eval_set in all the folds, which might not be what you mean by CV.

model=xgb.XGBClassifier()
clf = GridSearchCV(model, parameters,
                         fit_params={'early_stopping_rounds':20,\
                         'eval_set':[(X,y)]},cv=kfold)  

After some tweaking, I found the safest way to integrate early_stopping_rounds and the sklearn API is to implement an early_stopping mechanism your self. You can do it if you do a GridSearchCV with n_rounds as paramter to be tuned. You can then watch the mean_validation_score for the different models with increasing n_rounds. Then you can define a custom heuristic for early stop. it wont save the computational time needed to evaluate all the possible n_rounds though

I think it is also a better approach then using a single split hold-out for this purpose.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!