SQL - STDEVP or STDEV and how to use it?

一曲冷凌霜 submitted on 2019-11-29 16:04:02

Question


I have a table:

LocationId OriginalValue Mean
1          0.45         3.99  
2          0.33         3.99
3          16.74        3.99
4          3.31         3.99

and so forth...

How would I work out the standard deviation using this table, and which would you recommend: STDEVP or STDEV?


Answer 1:


To use it, simply run:

SELECT STDEVP(OriginalValue)
FROM yourTable

Based on the explanation below, you probably want STDEVP.

From here:

STDEV is used when the group of numbers being evaluated is only a partial sampling of the whole population. The denominator for dividing the sum of squared deviations is N-1, where N is the number of observations (a count of items in the data set). Technically, subtracting the 1 makes the variance estimate "unbiased" (this is known as Bessel's correction).

STDEVP is used when the group of numbers being evaluated is complete: it is the entire population of values. In this case, the 1 is NOT subtracted and the denominator for dividing the sum of squared deviations is simply N itself, the number of observations (a count of items in the data set). Applied to a mere sample, this form would be "biased", because it tends to underestimate the spread. Remembering that the P in STDEVP stands for "population" may be helpful. Since the data set is not a mere sample but consists of ALL the actual values, this function returns the exact standard deviation rather than an estimate.
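
The two denominators described above can also be written out by hand, which is one way to make use of the Mean column from the question. This is only a sketch: it assumes that Mean holds the average of OriginalValue over the whole table (the question does not say how it was computed) and reuses the placeholder name yourTable from the query above. If Mean is stored rounded, the manual columns may differ slightly from the built-in ones.

```sql
-- Assumption: every row's Mean column holds the average of OriginalValue
-- over the whole table, as the sample data in the question suggests.
SELECT
    -- Population form: sum of squared deviations divided by N.
    SQRT(SUM(SQUARE(OriginalValue - Mean)) / COUNT(*)) AS ManualStdevP,
    -- Sample form: divided by N - 1. NULLIF yields NULL instead of a
    -- divide-by-zero error when the table has a single row.
    SQRT(SUM(SQUARE(OriginalValue - Mean)) / NULLIF(COUNT(*) - 1, 0)) AS ManualStdev,
    -- Built-in aggregates for comparison.
    STDEVP(OriginalValue) AS BuiltInStdevP,
    STDEV(OriginalValue)  AS BuiltInStdev
FROM yourTable;
```

In practice the built-in aggregates are the simpler choice, since they compute the mean themselves and do not depend on a stored column.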




Answer 2:


Generally, you should use STDEV when you have to estimate the standard deviation from a sample. But if the column holds every value of the population, use STDEVP.

In general, if your data represents the entire population, use STDEVP; otherwise, use STDEV.

Note that for large samples the two functions return nearly the same value, so STDEV is the safer default in that case.
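
The claim about large samples follows from the two definitions. Writing s for the sample form (what STDEV computes), σ for the population form (what STDEVP computes), and x̄ for the mean of the N values, the two differ only by a factor that depends on N:

```latex
s = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}(x_i-\bar{x})^2},
\qquad
\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(x_i-\bar{x})^2}
\quad\Longrightarrow\quad
\frac{\sigma}{s} = \sqrt{\frac{N-1}{N}}
```

For N = 4 the ratio is about 0.866, for N = 30 about 0.983, and for N = 1000 about 0.9995, so the gap shrinks quickly as N grows.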




Answer 3:


In statistics there are two types of standard deviation: one for a sample and one for a population. The sample standard deviation, generally denoted by the letter s, is used as an estimate of the population standard deviation. The population standard deviation, generally denoted by the lower-case Greek letter sigma, is used when the data constitutes the complete population.

It is difficult to answer your question directly (sample or population) because it is difficult to tell which of the two you are working with. It often depends on context. Consider the following example. If I want to know the standard deviation of the age of students in my class, then I would use STDEVP because the class is my population. But if I want to use my class as a sample of the population of all students in the school (this would be what is known as a convenience sample, and would likely be biased, but I digress), then I would use STDEV because my class is a sample. The resulting value would be my best estimate of STDEVP for the whole school.

As mentioned above, (1) for large sample sizes (say, more than thirty), the difference between the two becomes trivial, and (2) generally you should use STDEV, not STDEVP, because in practice we usually don't have access to the population. Indeed, one could argue that if we always had access to populations, then we wouldn't need statistics. The entire point of inferential statistics is to be able to make inferences about a population based on a sample.
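
The classroom example might look like the following in SQL Server. The Students table, its ClassId and Age columns, and the class number are all hypothetical and only serve to illustrate the point: the rows are identical in both queries, and only the question being asked changes which function fits.

```sql
-- Hypothetical table, not from the original post: Students(ClassId INT, Age INT).

-- Case 1: the class itself is the population of interest,
-- so the population form applies.
SELECT STDEVP(Age) AS ClassAgeStdDev
FROM Students
WHERE ClassId = 1;

-- Case 2: the same class is treated as a sample of all students
-- in the school, so the sample form is used as an estimate.
SELECT STDEV(Age) AS EstimatedSchoolAgeStdDev
FROM Students
WHERE ClassId = 1;
```

For any class with at least two students whose ages are not all equal, the second query returns the larger value, because it divides by N - 1 instead of N.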



Source: https://stackoverflow.com/questions/14893912/sql-stdevp-or-stdev-and-how-to-use-it
