sklearn - How to retrieve PCA components and explained variance from inside a Pipeline passed to GridSearchCV

Submitted by 空扰寡人 on 2021-02-08 06:48:50

Question


I am using GridSearchCV with a pipeline as follows:

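from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

grid = GridSearchCV(
    Pipeline([
        ('reduce_dim', PCA()),
        ('classify', RandomForestClassifier(n_jobs=-1))
    ]),
    param_grid=[
        {
            # range() only accepts integers, so the float values are listed explicitly
            'reduce_dim__n_components': [0.7, 0.8],
            'classify__n_estimators': range(10, 50, 5),
            # 'sqrt' is what 'auto' meant for classifiers; 'auto' was removed in scikit-learn 1.3
            'classify__max_features': ['sqrt', 0.2],
            'classify__min_samples_leaf': [40, 50, 60],
            'classify__criterion': ['gini', 'entropy']
        }
    ],
    cv=5, scoring='f1')

grid.fit(X, y)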

How do I now retrieve PCA details like components and explained_variance from the grid.best_estimator_ model?

Furthermore, I also want to save the best_estimator_ to a file using pickle and later load it. How do I retrieve the PCA details from this loaded estimator? I suspect it will be the same as above.


Answer 1:


grid.best_estimator_ gives you the pipeline refit with the best parameters found by the search.

Use the pipeline's named_steps attribute to access its individual estimators.

So grid.best_estimator_.named_steps['reduce_dim'] returns the fitted PCA object. You can then read its components_ and explained_variance_ attributes like this:

grid.best_estimator_.named_steps['reduce_dim'].components_
grid.best_estimator_.named_steps['reduce_dim'].explained_variance_
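As for the pickle part of the question, a pickled pipeline should keep its fitted attributes, so the same named_steps lookup works on the loaded object. Here is a minimal sketch of the whole round trip. The synthetic data from make_classification, the reduced parameter grid, and the variable names pca, restored and restored_pca are assumptions made only to keep the example small and runnable. They are not from the original question.

```python
import pickle

import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Synthetic binary-classification data (an assumption; stands in for the asker's X, y).
X, y = make_classification(n_samples=200, n_features=10, n_informative=5,
                           random_state=0)

# A deliberately small grid so the search finishes quickly.
grid = GridSearchCV(
    Pipeline([
        ('reduce_dim', PCA()),
        ('classify', RandomForestClassifier(random_state=0)),
    ]),
    param_grid={
        'reduce_dim__n_components': [0.7, 0.8],
        'classify__n_estimators': [10, 20],
    },
    cv=3, scoring='f1')
grid.fit(X, y)

# The fitted PCA step of the best pipeline.
pca = grid.best_estimator_.named_steps['reduce_dim']
print(pca.components_.shape)        # (number of kept components, number of features)
print(pca.explained_variance_)
print(pca.explained_variance_ratio_)

# Round trip through pickle. This is done in memory here;
# pickle.dump / pickle.load behave the same way with a file object.
restored = pickle.loads(pickle.dumps(grid.best_estimator_))
restored_pca = restored.named_steps['reduce_dim']

# The restored PCA carries the same fitted arrays as the original.
same_components = np.array_equal(pca.components_, restored_pca.components_)
same_variance = np.array_equal(pca.explained_variance_,
                               restored_pca.explained_variance_)
print(same_components, same_variance)
```

explained_variance_ratio_ is often the more convenient attribute when n_components is given as a fraction, since it shows how much of the total variance each kept component accounts for. Note that a pickle is generally only safe to load with the same scikit-learn version that created it.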



Source: https://stackoverflow.com/questions/46800147/sklearn-how-to-retrieve-pca-components-and-explained-variance-from-inside-a-pi
