Question
Short Version:
Can standard deviations be added or combined? For example,
if StdDev(11,14,16,17)=X and StdDev(21,34,43,12)=Y,
can we calculate StdDev(11,14,16,17,21,34,43,12) from X and Y?
Long Version:
I am designing a star schema. The schema has a fact_table (grain=transaction) which stores individual transaction response_time. The schema also has an aggregate_table (grain=day) which stores the response_time_sum per day.
In my report I need to calculate the standard deviation of the response time for a given time dimension, say day, week, or month. How can I calculate the standard deviation using the aggregate_table instead of touching the huge fact_table?
Answer 1:
Yes, you can combine them. You need to know the number of observations, mean, and standard deviation for each day. The variance is easier to work with than the standard deviation, so I'll express everything else in terms of variance. (Standard deviation is defined as the square root of the variance.)
Denote:
n[i]  # number of observations for day i
m[i]  # mean for day i
v[i]  # variance for day i
You'll need to calculate the total number of observations N and the overall mean M. This is easy:
days = [day1, day2, ..., day_final]
N = sum(n[i] for i in days)
M = sum(n[i] * m[i] for i in days) / N
The overall variance V is more complicated, but still can be calculated:
s1 = sum(n[i] * v[i] for i in days)
s2 = sum(n[i] * (m[i] - M)**2 for i in days)
V = (s1 + s2) / N
The above are for the population variance. If you instead have v[i] as the sample variance, some minor modifications to s1 and V are needed:
s1_sample = sum((n[i] - 1) * v[i] for i in days)
V_sample = (s1_sample + s2) / (N - 1)
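The formulas above can be put together in a small runnable sketch. The function name `combine_variances` and the `(n, mean, variance)` tuple layout are illustrative choices, not part of the original answer; the input uses the two sets from the question, summarized by hand as count, mean, and sample variance.

```python
import math


def combine_variances(groups, sample=False):
    """Combine per-group (n, mean, variance) triples into overall (N, M, V).

    With sample=True, the input variances and the result both use the
    n - 1 denominator; otherwise both are population variances.
    """
    N = sum(n for n, _, _ in groups)
    M = sum(n * m for n, m, _ in groups) / N
    # Within-group spread: undo each group's own denominator.
    s1 = sum((n - 1 if sample else n) * v for n, _, v in groups)
    # Between-group spread: distance of each group mean from the overall mean.
    s2 = sum(n * (m - M) ** 2 for n, m, _ in groups)
    V = (s1 + s2) / (N - 1 if sample else N)
    return N, M, V


# (11,14,16,17) and (21,34,43,12) as (n, mean, sample variance).
groups = [(4, 14.5, 7.0), (4, 27.5, 565.0 / 3)]
N, M, V = combine_variances(groups, sample=True)
print(N, M, math.sqrt(V))
```

The printed standard deviation should come out at roughly 11.489, the same as computing it directly over all eight values. For the star schema this suggests the aggregate_table would need a count and a variance (or standard deviation) per day alongside the sum, since the sum alone is not enough.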
Answer 2:
No, you can't add standard deviations.
Prove it to yourself with the numbers you provided:
X = 2.645751311, Y = 13.72345923 (both sample standard deviations)
Standard deviation of combined set: 11.48912529, whereas X + Y = 16.36921054
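These figures can be reproduced with Python's standard `statistics` module. This is only a quick check, assuming the sample (n - 1) definition of standard deviation, which is the one that matches the numbers quoted here.

```python
import statistics

a = [11, 14, 16, 17]
b = [21, 34, 43, 12]

x = statistics.stdev(a)             # sample standard deviation of the first set
y = statistics.stdev(b)             # sample standard deviation of the second set
combined = statistics.stdev(a + b)  # sample standard deviation of all eight values

# Naively adding x and y does not give the combined value.
print(x, y, x + y, combined)
```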
You can do a more general proof using the formula for standard deviation. You need the covariance of the two - scroll down to "identities":
http://en.wikipedia.org/wiki/Standard_deviation
Source: https://stackoverflow.com/questions/7753002/adding-combining-standard-deviations