selection of features using PCA

伪装坚强ぢ 2020-12-30 15:47

I am doing unsupervised classification. For this I have 8 features (Variance of Green, Std. dev. of Green, Mean of Red, Variance of Red, Std. dev. of Red, Mean of Hue, Vari

3 answers
  •  梦毁少年i
    2020-12-30 16:36

    From the pcacov docs:

    COEFF is a p-by-p matrix, with each column containing coefficients for one principal component. The columns are in order of decreasing component variance.

    Since explained shows that only the first component really contributes a significant amount to explained variance, you should look at the first column of PC to see which original features it uses:

    0.0038
    0.0755
    0.7008 <---
    0.0007 
    0.0320 
    0.7065 <---
    0.0026 
    0.0543 
    

    It turns out that, in your example, the 3rd and 6th features (marked with <---) are the main contributors to the first principal component. You could say that these features are the most important ones.
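As a rough illustration of that reading, the sketch below ranks the features by their squared weight in the first component. The eight numbers are copied from the column quoted above; the variable names are made up for this example, and the features are referred to by position only, since the full feature list is not shown.

```python
# First column of the coefficient matrix, copied from the values quoted above.
first_pc = [0.0038, 0.0755, 0.7008, 0.0007, 0.0320, 0.7065, 0.0026, 0.0543]

# A principal component has (approximately) unit length, so the squared
# weights add up to about 1 and can be read as each feature's share.
total = sum(w * w for w in first_pc)
shares = [w * w / total for w in first_pc]

# Rank feature positions (1-based, as in the answer) by share, largest first.
ranked = sorted(range(1, len(first_pc) + 1),
                key=lambda i: shares[i - 1],
                reverse=True)
top_two = ranked[:2]

for i in ranked:
    print(f"feature {i}: weight {first_pc[i - 1]:.4f}, share {shares[i - 1]:.2%}")
```

Squaring the weights is only one convenient way to compare them; sorting by absolute value gives the same order.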

    Similarly, based on the fact that the 1st, 4th and 7th features only get large weights in some of the last columns of PC, one could conclude that they are relatively unimportant.

    However, for this sort of per-feature analysis, PCA might not be the best fit; you could derive such information from the standard deviations of the original features just as well.
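To see why the standard deviations can tell a similar story, here is a minimal NumPy sketch (not MATLAB, and not the asker's data). It builds a synthetic data set in which two features have a much larger spread than the others, runs an eigendecomposition of the covariance matrix, which is assumed here to be roughly what pcacov computes, and compares the dominant weights with the largest standard deviations. The data, the seed and all names are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Synthetic features: columns 1 and 3 share a common signal and have a
# large spread; columns 0 and 2 are small-scale noise.
base = rng.normal(size=n)
X = np.column_stack([
    0.1 * rng.normal(size=n),
    5.0 * base + 0.5 * rng.normal(size=n),
    0.2 * rng.normal(size=n),
    5.0 * base + 0.5 * rng.normal(size=n),
])

# PCA on the covariance matrix: eigh returns eigenvalues in ascending
# order, so reverse them to get decreasing component variance.
cov = np.cov(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]
latent = eigvals[order]
coeff = eigvecs[:, order]
explained = 100.0 * latent / latent.sum()

# Features with the largest absolute weight in the first component
# versus features with the largest standard deviation (0-based indices).
top_by_loading = np.argsort(np.abs(coeff[:, 0]))[::-1][:2]
top_by_std = np.argsort(X.std(axis=0, ddof=1))[::-1][:2]

print("explained (%):", np.round(explained, 2))
print("top by loading:", sorted(top_by_loading.tolist()))
print("top by std:    ", sorted(top_by_std.tolist()))
```

In this constructed case both rankings pick the same two features. That happens because an unscaled covariance PCA is dominated by whichever features vary the most; standardizing the features first would change the picture.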
