Question
I'm working with Python and I've implemented PCA using this tutorial.
Everything works great: I computed the covariance, did a successful transform, and brought the data back to the original dimensions with no problem.
But how do I perform whitening? I tried dividing the eigenvectors by the eigenvalues:
S, V = numpy.linalg.eig(cov)
V = V / S[:, numpy.newaxis]
and used V to transform the data, but this led to weird data values. Could someone please shed some light on this?
Answer 1:
Here's a numpy implementation of some Matlab code for matrix whitening I got from here.
import numpy as np
def whiten(X,fudge=1E-18):
   # the matrix X should be observations-by-components
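   # scatter matrix X^T X; this is proportional to the covariance matrix
   # only if the columns of X have already been mean-centered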
   Xcov = np.dot(X.T,X)
   # eigenvalue decomposition of the covariance matrix
   d, V = np.linalg.eigh(Xcov)
   # a fudge factor can be used so that eigenvectors associated with
   # small eigenvalues do not get overamplified.
   D = np.diag(1. / np.sqrt(d+fudge))
   # whitening matrix
   W = np.dot(np.dot(V, D), V.T)
   # multiply by the whitening matrix
   X_white = np.dot(X, W)
   return X_white, W
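The function above works on X.T @ X directly, so it assumes the data is already centered, and it normalizes the scatter matrix rather than the sample covariance. If what you want is output whose sample covariance is the identity, a minimal sketch could look like the following. The name whiten_centered, the eps default, and the test data are my own assumptions for illustration, not part of the original code.

```python
import numpy as np

def whiten_centered(X, eps=1e-8):
    # Hypothetical helper (not from the original answer): the same
    # V D^(-1/2) V^T whitening, but with explicit mean-centering and
    # the sample covariance (divided by n - 1) instead of X^T X.
    Xc = X - X.mean(axis=0)                  # center each column
    cov = Xc.T @ Xc / (Xc.shape[0] - 1)      # sample covariance
    d, V = np.linalg.eigh(cov)               # eigenvectors are the columns of V
    W = V @ np.diag(1.0 / np.sqrt(d + eps)) @ V.T
    return Xc @ W, W

# Made-up correlated data with a nonzero mean, purely for the check below.
rng = np.random.default_rng(0)
M = np.array([[2.0, 0.5, 0.0],
              [0.0, 1.0, 0.3],
              [0.0, 0.0, 0.5]])
X = rng.normal(size=(500, 3)) @ M + 5.0

X_white, W = whiten_centered(X)
cov_white = np.cov(X_white, rowvar=False)    # should be close to the identity
```

Checking np.cov of the result against the identity is a quick way to confirm that a whitening step did what you expected.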
You can also whiten a matrix using SVD:
def svd_whiten(X):
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    # U and Vt are the singular matrices, and s contains the singular values.
    # Since the columns of U and the rows of Vt are orthonormal vectors,
    # np.dot(U, Vt) will be white
    X_white = np.dot(U, Vt)
    return X_white
The second way is a bit slower, but probably more numerically stable, since it never forms X.T @ X, which squares the condition number of X.
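The two functions above should give the same result up to rounding: if X = U diag(s) Vt, then X.T @ X = V diag(s^2) Vt, so the whitening matrix is V diag(1/s) Vt and X times it collapses to U @ Vt. Here is a small check of that, as a sketch on made-up, centered data with the fudge factor left out.

```python
import numpy as np

# Made-up correlated data, centered so that X^T X is a scaled covariance.
rng = np.random.default_rng(1)
M = np.array([[2.0, 0.5, 0.0],
              [0.0, 1.0, 0.3],
              [0.0, 0.0, 0.5]])
X = rng.normal(size=(200, 3)) @ M
X = X - X.mean(axis=0)

# Eigendecomposition route: W = V diag(d^(-1/2)) V^T built from X^T X.
d, V = np.linalg.eigh(X.T @ X)
X_eig = X @ (V @ np.diag(1.0 / np.sqrt(d)) @ V.T)

# SVD route: X = U diag(s) Vt, so X @ W reduces to U @ Vt.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
X_svd = U @ Vt

same = np.allclose(X_eig, X_svd)   # the two routes agree
gram = X_svd.T @ X_svd             # close to the identity matrix
```

Note that both routes make X_white.T @ X_white the identity, so the sample covariance is I / (n - 1). Multiply the result by np.sqrt(n - 1) if you need unit sample covariance.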
Answer 2:
If you use Python's scikit-learn library for this, you can just set the built-in whiten parameter:
from sklearn.decomposition import PCA
pca = PCA(whiten=True)
whitened = pca.fit_transform(X)
Check the documentation for details.
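As a quick sanity check, here is a sketch that runs PCA(whiten=True) on made-up data and looks at the covariance of the output. The data and variable names are assumptions for illustration only.

```python
import numpy as np
from sklearn.decomposition import PCA

# Made-up correlated data with a nonzero mean; PCA centers it internally.
rng = np.random.default_rng(0)
M = np.array([[2.0, 0.5, 0.0],
              [0.0, 1.0, 0.3],
              [0.0, 0.0, 0.5]])
X = rng.normal(size=(500, 3)) @ M + 5.0

pca = PCA(whiten=True)
whitened = pca.fit_transform(X)

# Each principal component is scaled to unit variance, and the
# components are uncorrelated with each other.
cov_white = np.cov(whitened, rowvar=False)   # should be close to the identity
```

Keep in mind that the output is expressed in the principal-component basis, so it differs from the V D^(-1/2) V^T result in Answer 1 by a rotation, even though both have identity covariance.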
Answer 3:
I think you need to transpose V and take the square root of S. So the formula is
matrix_to_multiply_with_data = transpose( v ) * s^(-1/2 )
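Applied to the code in the question, that formula might look like the sketch below. It assumes the data has observations in rows, and it uses made-up data and eigh rather than eig (a swap on my part, since the covariance matrix is symmetric). The eigenvectors are the columns of V, so each column gets divided by the square root of its eigenvalue. With S[:, numpy.newaxis], the division runs along rows instead, which is one reason the original attempt gave odd values.

```python
import numpy as np

# Made-up correlated data, centered before computing the covariance.
rng = np.random.default_rng(2)
M = np.array([[2.0, 0.5, 0.0],
              [0.0, 1.0, 0.3],
              [0.0, 0.0, 0.5]])
X = rng.normal(size=(300, 3)) @ M
Xc = X - X.mean(axis=0)
cov = np.cov(Xc, rowvar=False)

S, V = np.linalg.eigh(cov)   # S: eigenvalues, V: eigenvectors as columns

# Divide column i of V by sqrt(S[i]); S broadcasts across the columns.
W = V / np.sqrt(S)

# With observations in rows, Xc @ W is the row-wise form of
# transpose(V) * S^(-1/2) applied to each data vector.
X_white = Xc @ W
cov_white = np.cov(X_white, rowvar=False)   # should be close to the identity
```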
Source: https://stackoverflow.com/questions/6574782/how-to-whiten-matrix-in-pca