Selection of features using PCA


伪装坚强ぢ 2020-12-30 15:47

I am doing unsupervised classification. For this I have 8 features (Variance of Green, Std. dev. of Green, Mean of Red, Variance of Red, Std. dev. of Red, Mean of Hue, Vari

3 answers

梦毁少年i (original poster)

2020-12-30 16:36
From the pcacov docs:

COEFF is a p-by-p matrix, with each column containing coefficients for one principal component. The columns are in order of decreasing component variance.

Since `explained` shows that only the first component contributes a significant share of the explained variance, look at the first column of `PC` to see which original features it weights most heavily:
```
0.0038
0.0755
0.7008 <---
0.0007 
0.0320 
0.7065 <---
0.0026 
0.0543 
```
It turns out that, in your example, the 3rd and 6th features (marked with `<---`) are the main contributors to the first principal component. You could say that these features are the most important ones.
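
The same reading of the first column can be done programmatically. Below is a minimal sketch in Python (the thread itself uses MATLAB, so this is a stand-in, not the original workflow). The coefficient values are copied from the column above; the variable names `first_pc`, `ranking` and `top_two_share` are made up for illustration. Features are referred to by position only.

```python
# First column of the coefficient matrix, copied from the answer above.
first_pc = [0.0038, 0.0755, 0.7008, 0.0007, 0.0320, 0.7065, 0.0026, 0.0543]

# Rank feature positions (1-based) by the magnitude of their coefficient.
ranking = sorted(
    range(1, len(first_pc) + 1),
    key=lambda i: abs(first_pc[i - 1]),
    reverse=True,
)

# Squared coefficients of a unit-length component sum to (about) 1, so each
# square can be read as that feature's share of the component.
squared = [c * c for c in first_pc]
top_two_share = (squared[2] + squared[5]) / sum(squared)

print("features ranked by |coefficient|:", ranking)
print("share of features 3 and 6: {:.3f}".format(top_two_share))
```

Ranking by absolute value matters because a large negative coefficient contributes just as much as a large positive one.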

Similarly, based on the fact that the 1st, 4th and 7th features only get large weights in some of the last columns of PC, one could conclude that they are relatively unimportant.

However, for this sort of per-feature analysis, PCA might not be the best fit; you could derive such information from the standard deviations of the original features just as well.
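
As a rough illustration of that last point, the sketch below ranks features by their sample standard deviation. It is written in Python rather than MATLAB. The data are a hypothetical toy table, not the measurements from the question, and the names `samples`, `spreads` and `order` are made up for illustration.

```python
from statistics import stdev

# Hypothetical toy data: each row is one sample, each column one feature.
# These numbers are invented for illustration only.
samples = [
    [10.0, 0.50, 100.0],
    [12.0, 0.52, 140.0],
    [11.0, 0.49, 60.0],
    [13.0, 0.51, 180.0],
    [9.0, 0.48, 20.0],
]

# Transpose rows into columns and compute the sample standard deviation
# of each feature.
columns = list(zip(*samples))
spreads = [stdev(col) for col in columns]

# Feature positions (1-based), ordered from largest to smallest spread.
order = sorted(range(1, len(spreads) + 1),
               key=lambda i: spreads[i - 1], reverse=True)

print("standard deviations:", spreads)
print("features by spread:", order)
```

When PCA is run on a covariance matrix of unscaled features, the features with the largest spread tend to dominate the first component, which is why this simpler check can give similar information.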