Computing a mean confidence interval without storing all the data points

后悔当初 2020-12-24 10:07

For large n (see below for how to determine what's large enough), it's safe to treat, by the central limit theorem, the distribution of the sample mean as nor

6 answers
  •  没有蜡笔的小新
    2020-12-24 10:29

    Don't accumulate the sum of squares. The resulting statistics are numerically inaccurate -- you end up subtracting two large, nearly equal numbers. Instead, maintain the running mean together with the variance, or (n-1)*variance, or something equivalent.
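
    The cancellation problem is easy to reproduce. The sketch below is only an illustration: the function names and the four data points are made up for the demo, and the two-pass version stores the data, so it serves as a reference value rather than as a streaming method.

```python
def naive_variance(xs):
    """Sample variance from the sum and the sum of squares (numerically unstable)."""
    n = len(xs)
    total = sum(xs)
    total_sq = sum(x * x for x in xs)
    # Two huge, nearly equal numbers are subtracted here.
    return (total_sq - total * total / n) / (n - 1)


def two_pass_variance(xs):
    """Reference value: sample variance from deviations about the mean."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


# Hypothetical data: a small spread around a large offset.
data = [1e9 + d for d in (4.0, 7.0, 13.0, 16.0)]

print(two_pass_variance(data))  # 30.0
print(naive_variance(data))     # far from 30: the squares are only stored to within ~128
```

    With values near 1e9 the squares are near 1e18, where adjacent doubles are 128 apart, so the true sum of squared deviations (90) is lost in rounding.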

    The straightforward way is to update these statistics incrementally as each datapoint arrives. The formula is neither complicated nor hard to derive (see John D. Cook's link).
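
    A minimal sketch of such an incremental update, using the commonly cited Welford recurrence (an assumption about which formula the linked article gives). The class and method names are hypothetical, and the interval uses the normal approximation from the question, so it is only meaningful for large n.

```python
import math
from statistics import NormalDist


class RunningStats:
    """Keeps n, the mean, and M2 = (n-1)*variance without storing the data."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the current mean

    def push(self, x):
        """Fold one new datapoint into the running statistics."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)  # uses the updated mean

    def variance(self):
        """Unbiased sample variance; needs at least two points."""
        if self.n < 2:
            raise ValueError("variance needs at least two datapoints")
        return self.m2 / (self.n - 1)

    def mean_confidence_interval(self, confidence=0.95):
        """Normal-approximation interval for the mean (assumes large n)."""
        z = NormalDist().inv_cdf(0.5 + confidence / 2)
        half_width = z * math.sqrt(self.variance() / self.n)
        return self.mean - half_width, self.mean + half_width


stats = RunningStats()
for d in (4.0, 7.0, 13.0, 16.0):  # hypothetical data with a large offset
    stats.push(1e9 + d)
print(stats.variance())  # 30.0
```

    Only three numbers are stored, and no large, nearly equal quantities are ever subtracted.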

    An even more accurate way to do it is to combine the datapoints pairwise-recursively. You can do this with memory logarithmic in n: register k holds statistics for 2^k older datapoints, which are combined with statistics for 2^k newer points to get statistics for 2^(k+1) points...
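
    One way to sketch the register scheme is as a binary counter whose digits are summaries. The names below are hypothetical, and the merge step uses the standard formula for combining two (n, mean, M2) summaries, which is an assumption about what the answer has in mind.

```python
def merge(a, b):
    """Combine two (n, mean, m2) summaries of disjoint batches."""
    na, mean_a, m2_a = a
    nb, mean_b, m2_b = b
    n = na + nb
    delta = mean_b - mean_a
    mean = mean_a + delta * nb / n
    m2 = m2_a + m2_b + delta * delta * na * nb / n
    return n, mean, m2


class PairwiseStats:
    """registers[k] is either None or a summary of exactly 2**k datapoints."""

    def __init__(self):
        self.registers = []

    def push(self, x):
        carry = (1, float(x), 0.0)  # summary of the single new point
        k = 0
        # Like adding 1 to a binary counter: merge and carry while occupied.
        while k < len(self.registers) and self.registers[k] is not None:
            carry = merge(self.registers[k], carry)  # older batch, newer batch
            self.registers[k] = None
            k += 1
        if k == len(self.registers):
            self.registers.append(carry)
        else:
            self.registers[k] = carry

    def summary(self):
        """Merge all occupied registers; returns None if no data was pushed."""
        total = None
        for reg in self.registers:
            if reg is not None:
                total = reg if total is None else merge(reg, total)
        return total


ps = PairwiseStats()
for x in range(1, 11):
    ps.push(x)
n, mean, m2 = ps.summary()
print(n, mean, m2 / (n - 1))  # count, mean, sample variance
```

    After n points at most floor(log2(n)) + 1 registers exist, and every merge inside `push` combines batches of equal size, which is where the extra accuracy comes from.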
