PCA decomposition with python: features relevances

大憨熊 submitted on 2019-12-13 16:30:04

Question


I'm following up on this topic: How can I use PCA/SVD in Python for feature selection AND identification? We decompose our data set in Python with PCA, using sklearn.decomposition.PCA. The components_ attribute gives us all the components. Our goal is very similar: we want to keep only the first several components (this part is not a problem) and see how much each input feature contributes to each PCA component, so that we know which features matter most to us. How can this be done? Another question: does the Python library offer other implementations of Principal Component Analysis?


Answer 1:


see how much each input feature contributes to each PCA component (to know which features matter most to us). How can this be done?

The components_ array has shape (n_components, n_features), so components_[i, j] already gives you the (signed) weight of feature j's contribution to component i.

If you want to get the indices of the top 3 features contributing to component i irrespective of the sign, you can do:

numpy.abs(pca.components_[i]).argsort()[::-1][:3]

Note: the [::-1] notation makes it possible to reverse the order of an array:

>>> import numpy as np
>>> np.array([1, 2, 3])[::-1]
array([3, 2, 1])
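
As a minimal sketch of the ranking described above, the helper below returns the strongest features for every component at once. The function name top_features, the feature names, and the small hand-written array are hypothetical illustrations; the array only stands in for a fitted pca.components_ and its values are not taken from any real data set.

```python
import numpy as np

def top_features(components, k=3, feature_names=None):
    """For each component, list the k features with the largest absolute weight."""
    components = np.asarray(components)
    # Sort columns by absolute weight, largest first, and keep the first k indices.
    order = np.argsort(-np.abs(components), axis=1)[:, :k]
    ranked = []
    for i, indices in enumerate(order):
        row = []
        for j in indices:
            name = feature_names[j] if feature_names is not None else int(j)
            # Keep the signed weight so the direction of the contribution is visible.
            row.append((name, float(components[i, j])))
        ranked.append(row)
    return ranked

# Hypothetical stand-in for pca.components_ (2 components, 4 features).
components = np.array([
    [0.1, -0.8, 0.5, 0.3],
    [0.7, 0.2, -0.1, -0.6],
])
ranked = top_features(components, k=2, feature_names=["a", "b", "c", "d"])
for i, row in enumerate(ranked):
    print(i, row)
```

With a fitted model, the same call would presumably be top_features(pca.components_, k=3), optionally with the column names of the input data.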

Another question: does the Python library offer other implementations of Principal Component Analysis?

PCA is just a truncated Singular Value Decomposition of the centered dataset. You can use numpy.linalg.svd directly if you wish. Have a look at the source code of the scikit-learn implementation of PCA for details.
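
To make the SVD route concrete, here is a rough sketch using only numpy.linalg.svd. The function name pca_svd and the random test data are assumptions made for illustration, and this is not the scikit-learn implementation: in particular, the signs of individual components may differ from what sklearn.decomposition.PCA returns.

```python
import numpy as np

def pca_svd(X, n_components):
    """PCA via a truncated SVD of the column-centered data matrix."""
    X = np.asarray(X, dtype=float)
    # Center each feature (column) at zero mean.
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal axes, ordered by decreasing singular value.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]            # shape (n_components, n_features)
    # Variance explained by each kept component.
    explained_variance = S[:n_components] ** 2 / (X.shape[0] - 1)
    # Project the centered data onto the kept components.
    scores = X_centered @ components.T
    return components, explained_variance, scores

# Hypothetical data: 100 samples, 5 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
components, explained_variance, scores = pca_svd(X, n_components=2)
print(components.shape, scores.shape)
```

The returned components array has the same (n_components, n_features) layout as components_, so the feature ranking shown earlier in this answer applies to it unchanged.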



Source: https://stackoverflow.com/questions/22348668/pca-decomposition-with-python-features-relevances
