Computing a mean confidence interval without storing all the data points

后悔当初 2020-12-24 10:07

For large n (see below for how to determine what's large enough), it's safe to treat, by the central limit theorem, the distribution of the sample mean as nor

6 answers
  •  执念已碎
    2020-12-24 10:41

       sigma = sqrt( (q - (s*s/n)) / (n-1) )
       delta = t(1-c/2,n-1) * sigma / sqrt(n)
    

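    Here n is the number of samples, s is the running sum of the x_i, q is the running sum of their squares, and c is the significance level (so the interval has confidence 1 - c). t(x, n-1) is the x quantile of the t-distribution with n-1 degrees of freedom, and the interval is s/n ± delta. If you are using GSL: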

    t = gsl_cdf_tdist_Qinv (c/2.0, n-1)
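The formulas above can be sketched as a small accumulator. This is only an illustration, not the answerer's code: the class name `RunningStats` and its methods are hypothetical, and the t critical value is passed in by the caller (for example from `gsl_cdf_tdist_Qinv(c/2.0, n-1)` or a t-table) rather than computed here.

```python
import math


class RunningStats:
    """Keep only n, s = sum(x) and q = sum(x*x); the samples are not stored."""

    def __init__(self):
        self.n = 0
        self.s = 0.0
        self.q = 0.0

    def add(self, x):
        self.n += 1
        self.s += x
        self.q += x * x

    def mean(self):
        return self.s / self.n

    def sigma(self):
        # Sample standard deviation: sqrt((q - s*s/n) / (n-1)).
        # max(..., 0.0) guards against a tiny negative value caused by rounding.
        var = (self.q - self.s * self.s / self.n) / (self.n - 1)
        return math.sqrt(max(var, 0.0))

    def half_width(self, t_crit):
        # delta = t(1-c/2, n-1) * sigma / sqrt(n); t_crit is supplied by the caller.
        return t_crit * self.sigma() / math.sqrt(self.n)


stats = RunningStats()
for x in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    stats.add(x)

# Assumed input: t(0.975, 7) is about 2.365 (standard t-table), i.e. c = 0.05.
t_crit = 2.365
delta = stats.half_width(t_crit)
print("mean:", stats.mean(), "interval:", stats.mean() - delta, stats.mean() + delta)
```

Passing the critical value in keeps the sketch free of any particular statistics library; only three numbers are kept no matter how many samples arrive.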
    

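    There's no need to store any data beyond n, the sum s, and the sum of squares q. Now, you might have a numerical issue because the sum of squares can be quite large: q and s*s/n are then two large, nearly equal numbers, and their difference loses precision. You could use the alternate definition of sigma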

    sigma = sqrt( sum((x_i - s/n)^2) / (n-1) )
    

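    and make two passes over the data: one to compute the mean s/n, and one to sum the squared deviations from it. This only works if the data can be read a second time. I would encourage you to consider using the GNU Scientific Library or a package like R to help you avoid numerical issues. Also, be careful about your use of the central limit theorem. Abuse of it is partly to blame for the whole financial crisis going on right now.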
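If a second pass is not possible, one commonly used alternative is Welford's online update, which tracks the running mean and the sum of squared deviations directly. It is not part of the answer above; the function name `welford` and the sample data are assumptions made for illustration.

```python
import math


def welford(samples):
    """Single-pass mean and sum of squared deviations (Welford's update)."""
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of (x_i - mean)^2
    for x in samples:
        n += 1
        d = x - mean          # deviation from the old mean
        mean += d / n         # updated mean
        m2 += d * (x - mean)  # uses both the old and the new mean
    return n, mean, m2


# Values with a large common offset, where q - s*s/n is prone to cancellation.
data = [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16]
n, mean, m2 = welford(data)
sigma = math.sqrt(m2 / (n - 1))
print("mean:", mean, "sigma:", sigma)
```

The state is still three numbers (n, mean, m2), so the storage requirement matches the sum and sum-of-squares approach.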
