Recursive feature elimination and grid search using scikit-learn

Backend · unresolved · 2 answers · 690 views
Asked by 小鲜肉 on 2020-12-13 21:33

I would like to perform recursive feature elimination with nested grid search and cross-validation for each feature subset using scikit-learn. From the RFECV

2 Answers
  • The code provided by DavidS did not work for me (sklearn 0.18); it required a small change to specify the param_grid and its usage.

    from sklearn.datasets import make_friedman1
    from sklearn.feature_selection import RFECV
    from sklearn.model_selection import GridSearchCV
    from sklearn.svm import SVR

    X, y = make_friedman1(n_samples=50, n_features=10, random_state=0)
    # Prefix the parameter with "estimator__" so the grid reaches the SVR
    # wrapped inside RFECV
    param_grid = [{'estimator__C': [0.01, 0.1, 1.0, 10.0, 100.0, 1000.0]}]
    estimator = SVR(kernel="linear")
    selector = RFECV(estimator, step=1, cv=4)
    clf = GridSearchCV(selector, param_grid, cv=7)
    clf.fit(X, y)
    clf.best_estimator_.estimator_    # SVR refit on the selected features
    clf.best_estimator_.grid_scores_  # CV score for each feature-subset size
    clf.best_estimator_.ranking_      # feature ranking (1 = selected)
    
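As a usage note, the outer GridSearchCV also records which C won and how many features the winning RFECV kept. A minimal sketch repeating the setup above (the grid values here are illustrative, trimmed for speed):

```python
from sklearn.datasets import make_friedman1
from sklearn.feature_selection import RFECV
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVR

X, y = make_friedman1(n_samples=50, n_features=10, random_state=0)
param_grid = [{"estimator__C": [0.1, 1.0, 10.0]}]
selector = RFECV(SVR(kernel="linear"), step=1, cv=4)
clf = GridSearchCV(selector, param_grid, cv=7)
clf.fit(X, y)

print(clf.best_params_)                 # winning C for the inner RFECV
print(clf.best_score_)                  # its mean outer-CV score
print(clf.best_estimator_.n_features_)  # features kept by the best RFECV
```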
  • 2020-12-13 22:03

    Unfortunately, RFECV is limited to cross-validating the number of features. You cannot search over the parameters of the SVM with it. The error occurs because SVR expects a float for C, and you gave it a list.

    You can do one of two things: run GridSearchCV on RFECV, which will result in splitting the data into folds two times (once inside GridSearchCV and once inside RFECV), but the search over the number of features will be efficient; or run GridSearchCV just on RFE, which would result in a single splitting of the data, but a very inefficient scan over the parameters of the RFE estimator.
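    The second option can be sketched as follows: GridSearchCV over a plain RFE, tuning both the SVM's C and the number of features to keep in one grid. The dataset and grid values below are illustrative, not from the original post; note that every feature count is scanned exhaustively, which is the inefficiency described above.

    ```python
    from sklearn.datasets import make_friedman1
    from sklearn.feature_selection import RFE
    from sklearn.model_selection import GridSearchCV
    from sklearn.svm import SVR

    X, y = make_friedman1(n_samples=50, n_features=10, random_state=0)
    param_grid = {
        "estimator__C": [0.1, 1.0, 10.0],   # SVM regularization strength
        "n_features_to_select": [3, 5, 7],  # each subset size refit from scratch
    }
    rfe = RFE(SVR(kernel="linear"), step=1)
    search = GridSearchCV(rfe, param_grid, cv=5)
    search.fit(X, y)
    print(search.best_params_)
    ```

    This splits the data only once (in the outer GridSearchCV), at the cost of rerunning the full elimination path for every candidate subset size.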

    If you would like to make the docstring less ambiguous, a pull request would be welcome :)
