Why does MATLAB native function cov (covariance matrix computation) use a different divisor than I expect?

Submitted by 社会主义新天地 on 2019-12-08 02:47:38

Question


Given a data matrix data with N samples (rows) and M dimensions (columns), say,

data = randn(N, M);

I could compute the covariance matrix with

data_mu = data - ones(N, 1)*mean(data);
cov_matrix = (data_mu'*data_mu)./N

If I use the native MATLAB function

cov_matrix2 = cov(data)

this will always be equal to

cov_matrix = (data_mu'*data_mu)./(N-1)

That is, the denominator is (N - 1), one less than I expect.

Why?? Can you reproduce it? Is this a bug??

I use MATLAB version 7.6.0.324 (2008).


Answer 1:


"That is, the denominator is (N - 1), one less than I expect. Why?? Can you reproduce it? Is this a bug??"

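No, it is not a bug. See the cov documentation. It comes down to population variance vs. sample variance: by default cov normalizes by N-1, which gives an unbiased estimate of the covariance of the population the samples were drawn from, whereas dividing by N gives the second moment of the data about its own mean.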

Note also that if you wish to use the denominator N instead of N-1, you can add a trailing 1 argument to the call, i.e. cov(x,y,1) or cov(x,1) as per the documentation.
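As a quick check of the two normalizations, here is a minimal sketch that reuses the question's own computation. The sizes N and M and the variable names cov_N, cov_Nm1, err_default and err_flag are chosen only for illustration.

```matlab
% Hypothetical sizes, chosen only for illustration.
N = 1000;  % number of samples (rows)
M = 3;     % number of dimensions (columns)
data = randn(N, M);

% Center each column by subtracting its mean, as in the question.
data_mu = data - ones(N, 1)*mean(data);

cov_N   = (data_mu'*data_mu)./N;        % divisor N
cov_Nm1 = (data_mu'*data_mu)./(N - 1);  % divisor N-1

% Largest absolute differences; both should be at rounding-error level.
err_default = max(max(abs(cov(data)    - cov_Nm1)))
err_flag    = max(max(abs(cov(data, 1) - cov_N)))
```

If both differences come out at rounding-error level, the default call matches the N-1 formula and the call with the trailing 1 matches the N formula.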




Answer 2:


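n-1 is the correct denominator when the variance of a population is estimated from a sample and the mean is estimated from that same sample. This is known as Bessel's correction (http://en.wikipedia.org/wiki/Bessel%27s_correction). Simply put, dividing by n-1 gives an unbiased estimate of the variance, whereas dividing by n underestimates it on average by a factor of (n-1)/n.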
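The bias can be seen in a small simulation. This is only a sketch: the sample size n, the number of trials and the variable names are assumptions made for illustration, and the data are standard normal so the true variance is 1.

```matlab
n = 5;            % small sample size, where the bias is easy to see
trials = 100000;  % number of repeated samples
x = randn(n, trials);  % each column is one sample; true variance is 1

% var works column-wise; the second argument selects the normalization
% (0 for divisor n-1, which is the default, and 1 for divisor n).
mean_var_n   = mean(var(x, 1))  % theoretical expectation: (n-1)/n = 0.8
mean_var_nm1 = mean(var(x, 0))  % theoretical expectation: 1
```

Averaged over many samples, the estimate with divisor n should settle near (n-1)/n times the true variance, while the one with divisor n-1 should settle near the true variance itself. The gap shrinks as n grows, which is why the choice matters little for large N.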



Source: https://stackoverflow.com/questions/3256798/why-does-matlab-native-function-cov-covariance-matrix-computation-use-a-differ
