I've fit a Pipeline object with RandomizedSearchCV:
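import numpy as np
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import RandomizedSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Scale the features, then fit a linear classifier with SGD
pipe_sgd = Pipeline([('scl', StandardScaler()),
                     ('clf', SGDClassifier(n_jobs=-1))])
# The 'clf__' prefix routes each parameter to the 'clf' step
param_dist_sgd = {'clf__loss': ['log'],
                  'clf__penalty': [None, 'l1', 'l2', 'elasticnet'],
                  'clf__alpha': np.linspace(0.15, 0.35),
                  'clf__n_iter': [3, 5, 7]}
sgd_randomized_pipe = RandomizedSearchCV(estimator=pipe_sgd,
                                         param_distributions=param_dist_sgd,
                                         cv=3, n_iter=30, n_jobs=-1)
sgd_randomized_pipe.fit(X_train, y_train)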
I want to access the coef_ attribute of the best_estimator_ but I'm unable to do that. I've tried accessing coef_ with the code below.
sgd_randomized_pipe.best_estimator_.coef_
However I get the following AttributeError...
AttributeError: 'Pipeline' object has no attribute 'coef_'
The scikit-learn docs say that coef_ is an attribute of SGDClassifier, which is the classifier step inside my best_estimator_.
What am I doing wrong?
You can always access the individual steps by the names you assigned when building the pipeline, using the named_steps dict.
scaler = sgd_randomized_pipe.best_estimator_.named_steps['scl']
classifier = sgd_randomized_pipe.best_estimator_.named_steps['clf']
and then access any attribute of the corresponding fitted estimator, such as coef_ and intercept_ on the classifier.
This is the formal attribute exposed by the Pipeline as specified in the documentation:
named_steps : dict
Read-only attribute to access any step parameter by user given name. Keys are step names and values are steps parameters.
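As a rough end-to-end sketch of the above: the data here is synthetic (make_classification stands in for X_train and y_train, which is an assumption), the search space is shrunk to keep it fast, and the loss and n_iter settings from the question are left out because their accepted values differ across scikit-learn versions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import RandomizedSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for X_train / y_train (an assumption, not the asker's data).
X_train, y_train = make_classification(n_samples=200, n_features=5, random_state=0)

pipe_sgd = Pipeline([('scl', StandardScaler()),
                     ('clf', SGDClassifier(random_state=0))])

# Small search space chosen only to keep the example fast.
param_dist_sgd = {'clf__penalty': ['l1', 'l2', 'elasticnet'],
                  'clf__alpha': np.linspace(0.15, 0.35, 10)}

search = RandomizedSearchCV(pipe_sgd, param_distributions=param_dist_sgd,
                            cv=3, n_iter=5, random_state=0)
search.fit(X_train, y_train)

# best_estimator_ is the refitted Pipeline; look up its steps by name.
best_pipe = search.best_estimator_
scaler = best_pipe.named_steps['scl']
classifier = best_pipe.named_steps['clf']

print(classifier.coef_.shape)       # weights learned by the SGDClassifier step
print(classifier.intercept_.shape)
print(scaler.mean_.shape)           # per-feature means learned by the scaler
```

The Pipeline itself still has no coef_; the attribute lives on the fitted 'clf' step, which is why the lookup has to go through named_steps.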
I've found one way to do this is by chained indexing with the steps attribute...
sgd_randomized_pipe.best_estimator_.steps[1][1].coef_
Is this best practice, or is there another way?
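For comparison, here is a small sketch (on synthetic data, with a plain Pipeline rather than the search object, both assumptions made for brevity) suggesting that positional indexing through steps and name lookup through named_steps reach the same fitted estimator:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data, used only to get a fitted pipeline.
X, y = make_classification(n_samples=100, n_features=4, random_state=0)
pipe = Pipeline([('scl', StandardScaler()),
                 ('clf', SGDClassifier(random_state=0))]).fit(X, y)

by_position = pipe.steps[1][1]      # steps is a list of (name, estimator) tuples
by_name = pipe.named_steps['clf']   # same estimator, looked up by its name

same_object = by_position is by_name
print(same_object)
```

Both routes work; the positional one silently points at a different step if the pipeline is later reordered or extended, whereas the name-based one keeps following 'clf'.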
I think this should work:
sgd_randomized_pipe.best_estimator_.named_steps['clf'].coef_
Source: https://stackoverflow.com/questions/43856280/return-coefficients-from-pipeline-object-in-sklearn