Question
Given a data matrix data of M dimensions and N samples, say,
data = randn(N, M);
I could compute the covariance matrix with
data_mu = data - ones(N, 1)*mean(data);
cov_matrix = (data_mu'*data_mu)./N
If I use the native MATLAB function
cov_matrix2 = cov(data)
this will always be equal to
cov_matrix = (data_mu'*data_mu)./(N-1)
That is, the denominator is (N - 1), which is one less than N.
Why? Can you reproduce it? Is this a bug?
I use MATLAB version 7.6.0.324 (2008).
Answer 1:
"That is, the denominator is (N - 1), which is one less than N. Why? Can you reproduce it? Is this a bug?"
It is not a bug. See the cov documentation: by default cov normalizes by N-1, which is the sample (unbiased) estimate, rather than by N, which is the population formula.
Note also that if you wish to use the denominator N instead of N-1, you can add a trailing 1 argument to the call, i.e. cov(x,y,1) or cov(x,1) as per the documentation.
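As a quick check of the two normalizations, here is a minimal sketch that compares cov against the manual formulas from the question. The sizes N and M and the variable names err_default and err_flag are illustrative assumptions, not part of the original post.

```matlab
% Sketch: compare cov's two normalizations with the manual formulas.
N = 1000; M = 3;                          % illustrative sizes (assumed)
data = randn(N, M);
data_mu = data - ones(N, 1)*mean(data);   % subtract each column's mean

cov_unbiased = (data_mu'*data_mu)./(N-1); % denominator N-1 (cov default)
cov_biased   = (data_mu'*data_mu)./N;     % denominator N

% Largest absolute difference between cov and each manual result.
err_default = max(max(abs(cov(data)    - cov_unbiased)));
err_flag    = max(max(abs(cov(data, 1) - cov_biased)));
```

Both err_default and err_flag should be zero up to floating-point rounding, which would confirm that the trailing 1 switches the denominator from N-1 to N.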
Answer 2:
n-1 is the correct denominator when the variance is estimated from a sample whose mean is also estimated from that same sample. This is known as Bessel's correction (http://en.wikipedia.org/wiki/Bessel%27s_correction). Simply put, dividing by n-1 gives an unbiased estimate of the population variance, whereas dividing by n underestimates it on average.
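The reasoning behind Bessel's correction can be sketched in a few lines. This assumes the samples x_1, ..., x_n are independent and identically distributed with mean \mu and variance \sigma^2; these symbols are introduced here for illustration and do not appear in the original answers.

```latex
\begin{align*}
\sum_{i=1}^{n} (x_i - \bar{x})^2
  &= \sum_{i=1}^{n} (x_i - \mu)^2 - n(\bar{x} - \mu)^2 \\
\mathbb{E}\!\left[\sum_{i=1}^{n} (x_i - \bar{x})^2\right]
  &= n\sigma^2 - n \cdot \frac{\sigma^2}{n} = (n-1)\,\sigma^2 \\
\mathbb{E}\!\left[\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2\right]
  &= \sigma^2,
\qquad
\mathbb{E}\!\left[\frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2\right]
  = \frac{n-1}{n}\,\sigma^2
\end{align*}
```

The second line uses the fact that the sample mean has variance \sigma^2/n. The 1/n estimator is therefore too small by a factor of (n-1)/n on average, a gap that shrinks as n grows.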
Source: https://stackoverflow.com/questions/3256798/why-does-matlab-native-function-cov-covariance-matrix-computation-use-a-differ