How to use scikit-learn PCA for feature reduction and know which features are discarded

清歌不尽 2021-01-30 17:37

I am trying to run a PCA on a matrix of dimensions m x n where m is the number of features and n the number of samples.

Suppose I want to preserve the nf fe

3 Answers
  •  独厮守ぢ
    2021-01-30 18:38

    The directions (principal axes) that your PCA object determined during fitting are stored in pca.components_, an array of shape (n_components, n_features). The subspace orthogonal to the one spanned by the rows of pca.components_ is what gets discarded.

    Please note that PCA does not "discard" or "retain" any of your pre-defined features (encoded by the columns you specify). It mixes all of them (by weighted sums) to find orthogonal directions of maximum variance.
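
    To make the "weighted sums" point concrete, here is a minimal sketch on made-up random data (the array X, its shape and n_components=2 are assumptions for illustration, not taken from the question). It also assumes the data is laid out the way scikit-learn expects, with samples in rows and features in columns, so a matrix with features in rows would need to be transposed first.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical data: 200 samples (rows) x 5 features (columns).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))

pca = PCA(n_components=2)
Z = pca.fit_transform(X)

# One row per retained direction, one column per original feature.
print(pca.components_.shape)  # (2, 5)

# Each new coordinate is a weighted sum of ALL centered original features;
# the weights are the entries of the corresponding row of components_.
Z_manual = (X - pca.mean_) @ pca.components_.T
print(np.allclose(Z, Z_manual))

# What is lost is the part of the data orthogonal to the retained directions:
# the reconstruction residual has no component along any row of components_.
X_back = pca.inverse_transform(Z)
residual = X - X_back
print(np.allclose(residual @ pca.components_.T, 0.0))
```

    Every row of pca.components_ generally has a non-zero weight for every original column, which is why no single input feature can be said to have been "kept" or "dropped".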

    If this is not the behaviour you are looking for, then PCA dimensionality reduction is not the way to go. For some simple general feature selection methods, you can take a look at sklearn.feature_selection.
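
    As one possible illustration, the sketch below uses VarianceThreshold from sklearn.feature_selection, which drops original columns outright and can report which ones were dropped. The data is made up, and choosing this particular selector is an assumption on my part; the module has others (for example SelectKBest, which also needs a target y).

```python
import numpy as np
from sklearn.feature_selection import VarianceThreshold

# Hypothetical data: 200 samples x 4 features, with column 2 made constant
# so that it carries no information.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
X[:, 2] = 1.0

# Remove every feature whose variance does not exceed the threshold.
selector = VarianceThreshold(threshold=0.0)
X_reduced = selector.fit_transform(X)

kept = selector.get_support()       # boolean mask over the original columns
discarded = np.flatnonzero(~kept)   # indices of the columns that were dropped
print(kept, discarded)
print(X_reduced.shape)
```

    Unlike PCA, the columns of X_reduced here are unmodified original features, so get_support() answers the "which features are discarded" question directly.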
