How is the complexity of PCA O(min(p^3,n^3))?

放肆的年华 提交于 2019-12-03 08:49:33

问题


I've been reading a paper on Sparse PCA, which is: http://stats.stanford.edu/~imj/WEBLIST/AsYetUnpub/sparse.pdf

And it states that, if you have n data points, each represented with p features, then, the complexity of PCA is O(min(p^3,n^3)).

Can someone please explain how/why?


回答1:


Covariance matrix computation is O(p2n); its eigen-value decomposition is O(p3). So, the complexity of PCA is O(p2n+p3).

O(min(p3,n3)) would imply that you could analyze a two-dimensional dataset of any size in fixed time, which is patently false.




回答2:


Assuming your dataset is $X \in \R^{nxp}$ where n: number of samples, d: dimensions of a sample, you are interested in the eigenanalysis of $X^TX$ which is the main computational cost of PCA. Now matrices $X^TX \in \R^{pxp}$ and $XX^T \in \R^{nxn}$ have the same min(n, p) non negative eigenvalues and eigenvectors. Assuming p less than n you can solve the eigenanalysis in $O(p^3)$. If p greater than n (for example in computer vision in many cases the dimensionality of sample -number of pixels- is greater than the number of samples available) you can perform eigenanalysis in $O(n^3)$ time. In any case you can get the eigenvectors of one matrix from the eigenvalues and eigenvectors of the other matrix and do that in $O(min(p, n)^3)$ time.

$$X^TX = V \Lambda V^T$$

$$XX^T = U \Lambda U^T$$

$$U = XV\Lambda^{-1/2}$$




回答3:


Below is michaelt's answer provided in both the original LaTeX and rendered as a PNG.

LaTeX code:

Assuming your dataset is $X \in R^{n\times p}$ where n: number of samples, p: dimensions of a sample, you are interested in the eigenanalysis of $X^TX$ which is the main computational cost of PCA. Now matrices $X^TX \in \R^{p \times p}$ and $XX^T \in \R^{n\times n}$ have the same min(n, p) non negative eigenvalues and eigenvectors. Assuming p less than n you can solve the eigenanalysis in $O(p^3)$. If p greater than n (for example in computer vision in many cases the dimensionality of sample -number of pixels- is greater than the number of samples available) you can perform eigenanalysis in $O(n^3)$ time. In any case you can get the eigenvectors of one matrix from the eigenvalues and eigenvectors of the other matrix and do that in $O(min(p, n)^3)$ time.



来源:https://stackoverflow.com/questions/20507646/how-is-the-complexity-of-pca-ominp3-n3

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!