Principal component analysis in Python

情深已故 2020-11-30 16:49

I'd like to use principal component analysis (PCA) for dimensionality reduction. Does numpy or scipy already have it, or do I have to roll my own using numpy.linalg.eigh?

11 answers
  •  一整个雨季
    2020-11-30 17:13

    SVD should work fine with 460 dimensions. It takes about 7 seconds on my Atom netbook. The eig() approach takes longer (as expected, since it uses more floating-point operations) and will almost always be less accurate, because forming the covariance matrix first squares the condition number of the problem.
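
    A minimal sketch of the SVD route, using only numpy. The function name pca_svd, the random test data, and the convention that rows are examples are assumptions made for illustration; they are not from the original answer.

```python
import numpy as np


def pca_svd(data, n_components):
    """PCA via SVD. `data` has shape (n_samples, n_features): rows are examples."""
    # Center each feature (column) on its mean.
    centered = data - data.mean(axis=0)
    # Thin SVD: centered = u @ diag(s) @ vt. Rows of vt are the principal
    # directions, already sorted by decreasing singular value.
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    # Squared singular values divided by (n - 1) are the covariance eigenvalues.
    explained_variance = s[:n_components] ** 2 / (data.shape[0] - 1)
    # Coordinates of every example in the reduced space.
    projected = centered @ components.T
    return projected, components, explained_variance


# Hypothetical data: 200 examples in 460 dimensions.
rng = np.random.default_rng(0)
data = rng.normal(size=(200, 460))
projected, components, explained_variance = pca_svd(data, 10)
```

    Note that the covariance matrix is never formed here, which is where the accuracy advantage over eig() comes from.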

    If you have fewer than 460 examples, diagonalize the scatter matrix (x - datamean)^T (x - datamean) instead, assuming your data points are columns, and then left-multiply the resulting eigenvectors by (x - datamean) to map them back into the original space. That may be faster when you have more dimensions than data points.
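
    The small-sample trick above might look like the following sketch. The name pca_small_sample and the 460 x 50 random matrix are hypothetical; columns are data points, as the answer assumes, and the final normalization to unit length is an added step so that the results are comparable to ordinary principal directions.

```python
import numpy as np


def pca_small_sample(x, n_components):
    """PCA for n_samples < n_features.

    `x` has shape (n_features, n_samples): columns are data points.
    `n_components` should not exceed n_samples - 1, the rank after centering.
    """
    # Subtract the mean data point from every column.
    centered = x - x.mean(axis=1, keepdims=True)
    # Small (n_samples x n_samples) matrix instead of the full
    # (n_features x n_features) covariance.
    gram = centered.T @ centered
    eigvals, eigvecs = np.linalg.eigh(gram)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Left-multiply by the centered data to get directions in feature space,
    # then rescale each column to unit length.
    directions = centered @ eigvecs
    directions /= np.linalg.norm(directions, axis=0)
    return directions, eigvals


# Hypothetical data: 50 examples in 460 dimensions.
rng = np.random.default_rng(0)
x = rng.normal(size=(460, 50))
directions, eigvals = pca_small_sample(x, 5)
```

    Here eigh works on a 50 x 50 matrix rather than a 460 x 460 one, which is the source of the possible speedup.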
