How can I avoid using estimator_params when using RFECV nested within GridSearchCV?

Submitted by 本秂侑毒 on 2019-11-27 06:26:52

Question


I'm currently working on recursive feature elimination (RFECV) within a grid search (GridSearchCV) for tree-based methods using scikit-learn. To do this, I'm using the current dev version on GitHub (0.17), which allows RFECV to use the feature importances from the tree methods to select which features to discard.

For clarity this means:

  • loop over hyperparameters for the current tree method
  • for each set of parameters perform recursive feature elimination to obtain the optimal number of features
  • report the 'score' (e.g. accuracy)
  • determine which set of parameters produced the best score
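The four steps above can be written out by hand, which makes explicit what the nested search computes. This is only a rough sketch under stated assumptions: it uses scikit-learn 0.18 or later (for sklearn.model_selection), synthetic data from make_classification in place of the real training set, and illustrative names (grid, results, best_params) that are not part of the original code.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFECV
from sklearn.model_selection import ParameterGrid, cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption: the real X_train/y_train are not shown)
X, y = make_classification(n_samples=200, n_features=10,
                           n_informative=4, random_state=0)

# Hypothetical grid over the tree's hyperparameters
grid = {"max_depth": [1, 5, None], "class_weight": ["balanced", None]}

results = []
for tree_params in ParameterGrid(grid):
    # Step 1: one hyperparameter setting for the tree
    tree = DecisionTreeClassifier(random_state=0, **tree_params)
    # Step 2: recursive feature elimination with its own inner CV
    selector = RFECV(tree, step=1, cv=3, scoring="accuracy")
    # Step 3: score the whole selector on outer folds
    score = cross_val_score(selector, X, y, cv=3, scoring="accuracy").mean()
    results.append((score, tree_params))

# Step 4: keep the setting with the best mean accuracy
best_score, best_params = max(results, key=lambda r: r[0])
print(best_params, best_score)
```

GridSearchCV wrapped around RFECV performs the same loop and additionally refits the best configuration on the full training set.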

This code is working fine at the moment, but I'm getting a deprecation warning about using estimator_params. Here is the current code:

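# imports (in 0.17 GridSearchCV lives in sklearn.grid_search;
# it moved to sklearn.model_selection in 0.18)
from sklearn.feature_selection import RFECV
from sklearn.grid_search import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# set up list of parameter dictionaries (better way to do this?)
depth = [1, 5, None]
weight = ['balanced', None]
params = []

for d in depth:
    for w in weight:
        params.append(dict(max_depth=d,
                           class_weight=w))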

# specify the classifier
estimator = DecisionTreeClassifier(random_state=0, 
                                   max_depth=None, 
                                   class_weight='balanced')

# specify the feature selection method
selector = RFECV(estimator,
                 step=1, 
                 cv=3, 
                 scoring='accuracy')

# set up the parameter search
clf = GridSearchCV(selector, 
                   {'estimator_params': params},
                   cv=3)

clf.fit(X_train, y_train)

clf.best_estimator_.estimator_

Here is the deprecation warning in full:

home/csw34/git/scikit-learn/sklearn/feature_selection/rfe.py:154: DeprecationWarning:

The parameter 'estimator_params' is deprecated as of version 0.16 and will be removed in 0.18. The parameter is no longer necessary because the value is set via the estimator initialisation or set_params method.

How would I be able to achieve the same result without using estimator_params in GridSearchCV to pass the parameters through RFECV to the estimator?


Answer 1:


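Use the nested-parameter syntax instead: prefix each tree parameter with estimator__ (two underscores) and GridSearchCV will pass it through RFECV to the wrapped estimator, so estimator_params is no longer needed. This solves your problem: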

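# GridSearchCV is in sklearn.model_selection from 0.18 on
# (sklearn.grid_search in earlier releases)
from sklearn.feature_selection import RFECV
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# the estimator__ prefix addresses parameters of the estimator wrapped by RFECV
params = {'estimator__max_depth': [1, 5, None],
          'estimator__class_weight': ['balanced', None]}
estimator = DecisionTreeClassifier()
selector = RFECV(estimator, step=1, cv=3, scoring='accuracy')
clf = GridSearchCV(selector, params, cv=3)
clf.fit(X_train, y_train)
# fitted tree inside the best RFECV found by the search
clf.best_estimator_.estimator_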
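To try the approach end to end without the original data, something like the following should work. It is a minimal sketch, assuming scikit-learn 0.18 or later and using make_classification as a stand-in for X_train and y_train; the names param_grid, best_selector and best_tree are illustrative only.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFECV
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption: the asker's data is not available)
X_train, y_train = make_classification(n_samples=200, n_features=10,
                                       n_informative=4, random_state=0)

# Tree parameters are addressed through RFECV with the estimator__ prefix
param_grid = {
    "estimator__max_depth": [1, 5, None],
    "estimator__class_weight": ["balanced", None],
}

selector = RFECV(DecisionTreeClassifier(random_state=0),
                 step=1, cv=3, scoring="accuracy")
clf = GridSearchCV(selector, param_grid, cv=3)
clf.fit(X_train, y_train)

best_selector = clf.best_estimator_   # RFECV refitted with the best parameters
best_tree = best_selector.estimator_  # tree fitted on the selected features only

print(clf.best_params_)            # winning hyperparameters
print(best_selector.n_features_)   # number of features RFECV kept
print(best_selector.support_)      # boolean mask of the kept features
```

Fixing random_state on the tree, as the question's code does, keeps the selected features reproducible between runs.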

To see every parameter name that can be tuned this way, including the nested estimator__ ones, use:

print(selector.get_params())
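As a small illustration of what get_params() exposes, the sketch below filters the keys that belong to the wrapped tree and then sets one of them with set_params, which is the mechanism the deprecation warning refers to. The variable name nested is an assumption made for this example.

```python
from sklearn.feature_selection import RFECV
from sklearn.tree import DecisionTreeClassifier

selector = RFECV(DecisionTreeClassifier(), step=1, cv=3, scoring="accuracy")

# Keys starting with "estimator__" are parameters of the wrapped tree
nested = sorted(k for k in selector.get_params() if k.startswith("estimator__"))
print(nested)

# set_params forwards the value to the inner estimator,
# which is exactly what GridSearchCV does for each grid point
selector.set_params(estimator__max_depth=5)
print(selector.estimator.max_depth)
```

Any key in that list can be used directly as a key of the GridSearchCV parameter grid.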


Source: https://stackoverflow.com/questions/31784392/how-can-i-avoid-using-estimator-params-when-using-rfecv-nested-within-gridsearch
