Standard deviation of a list

Anonymous (unverified), submitted 2019-12-03 01:58:03

Question:

I want to find the mean and standard deviation of the 1st, 2nd, ... elements of several lists (A to Z). For example, I have

    A_rank = [0.8, 0.4, 1.2, 3.7, 2.6, 5.8]
    B_rank = [0.1, 2.8, 3.7, 2.6, 5, 3.4]
    C_rank = [1.2, 3.4, 0.5, 0.1, 2.5, 6.1]
    # etc. (up to Z_rank)...

Now I want to take the mean and std of *_rank[0], the mean and std of *_rank[1], etc.
(i.e., the mean and std of the 1st element from all the (A..Z)_rank lists;
the mean and std of the 2nd element from all the (A..Z)_rank lists;
the mean and std of the 3rd element...; etc.).

Answer 1:

I would put A_rank et al. into a 2D NumPy array, and then use numpy.mean() and numpy.std() with axis=0 to compute the means and the standard deviations position by position:

    In [17]: import numpy

    In [18]: arr = numpy.array([A_rank, B_rank, C_rank])

    In [20]: numpy.mean(arr, axis=0)
    Out[20]:
    array([ 0.7       ,  2.2       ,  1.8       ,  2.13333333,  3.36666667,
            5.1       ])

    In [21]: numpy.std(arr, axis=0)
    Out[21]:
    array([ 0.45460606,  1.29614814,  1.37355985,  1.50628314,  1.15566239,
            1.2083046 ])


Answer 2:

Since Python 3.4 / PEP 450 there is a statistics module in the standard library, which has a function stdev for calculating the sample standard deviation of iterables like yours:

    >>> A_rank = [0.8, 0.4, 1.2, 3.7, 2.6, 5.8]
    >>> import statistics
    >>> statistics.stdev(A_rank)
    2.0634114147853952
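The call above handles a single list. To get per-position statistics across several lists, as the question asks, one option is to transpose the lists with zip and apply the statistics functions to each column. The following is only a sketch under that assumption; the names columns, means and pstdevs are illustrative. It uses pstdev (population standard deviation) so the numbers line up with numpy.std's default; swap in stdev for the sample version.

```python
import statistics

A_rank = [0.8, 0.4, 1.2, 3.7, 2.6, 5.8]
B_rank = [0.1, 2.8, 3.7, 2.6, 5, 3.4]
C_rank = [1.2, 3.4, 0.5, 0.1, 2.5, 6.1]

# zip groups the 1st elements together, then the 2nd elements, and so on.
columns = list(zip(A_rank, B_rank, C_rank))

# One mean and one population standard deviation per position.
means = [statistics.mean(col) for col in columns]
pstdevs = [statistics.pstdev(col) for col in columns]

for i, (m, s) in enumerate(zip(means, pstdevs)):
    print(i, round(m, 4), round(s, 4))
```

With more lists, pass them all to zip (or build a list of lists and call zip(*all_lists)).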


Answer 3:

Here's some pure-Python code you can use to calculate the mean and standard deviation.

All code below is based on the statistics module in Python 3.4+.

    def mean(data):
        """Return the sample arithmetic mean of data."""
        n = len(data)
        if n < 1:
            raise ValueError('mean requires at least one data point')
        return sum(data) / n

Note: for improved accuracy when summing floats, the statistics module uses a custom function _sum rather than the built-in sum which I've used in its place.
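The stddev function called in the examples that follow is not shown in this copy of the answer. Here is a minimal sketch consistent with that usage, assuming it returns the population standard deviation by default and takes a ddof parameter for the sample version. The error handling and internals are assumptions, not the answer's original code.

```python
def mean(data):
    """Return the arithmetic mean of a non-empty sequence."""
    n = len(data)
    if n < 1:
        raise ValueError("mean requires at least one data point")
    return sum(data) / n


def stddev(data, ddof=0):
    """Return the standard deviation of data.

    ddof=0 divides by n (population); ddof=1 divides by n - 1 (sample).
    """
    n = len(data)
    if n - ddof < 1:
        raise ValueError("need more than ddof data points")
    c = mean(data)
    # Sum of squared deviations from the mean.
    ss = sum((x - c) ** 2 for x in data)
    return (ss / (n - ddof)) ** 0.5
```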

Now we have for example:

    >>> mean([1, 2, 3])
    2.0
    >>> stddev([1, 2, 3])  # population standard deviation
    0.816496580927726
    >>> stddev([1, 2, 3], ddof=1)  # sample standard deviation
    1.0


Answer 4:

In Python 2.7.1, you may calculate standard deviation using numpy.std() for:

  • Population std: Just use numpy.std() with no additional arguments besides your data list.
  • Sample std: You need to pass ddof (i.e. Delta Degrees of Freedom) set to 1, as in the following example:

    numpy.std(your_list, ddof=1)

The divisor used in calculations is N - ddof, where N represents the number of elements. By default ddof is zero.

With ddof=1, numpy.std() calculates the sample std rather than the population std.



Answer 5:

In Python 2.7 you can use NumPy's numpy.std(), which gives the population standard deviation.

In Python 3.4+, statistics.stdev() returns the sample standard deviation. The pstdev() function gives the same result as numpy.std().
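To make the population/sample distinction concrete, here is a small sketch using only the standard library. The variable names are illustrative. It relies on the two values differing only in the divisor (n versus n - 1), so the population value rescaled by sqrt(n / (n - 1)) should match the sample value.

```python
import math
import statistics

data = [0.8, 0.4, 1.2, 3.7, 2.6, 5.8]
n = len(data)

population_sd = statistics.pstdev(data)  # divides by n
sample_sd = statistics.stdev(data)       # divides by n - 1

# Rescaling the population value recovers the sample value.
rescaled = population_sd * math.sqrt(n / (n - 1))

print(population_sd, sample_sd, rescaled)
```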



Answer 6:

Pure Python code (population standard deviation):

    from functools import reduce  # reduce is not a builtin in Python 3
    from math import sqrt

    def stddev(lst):
        mean = float(sum(lst)) / len(lst)
        return sqrt(float(reduce(lambda x, y: x + y, map(lambda x: (x - mean) ** 2, lst))) / len(lst))


Answer 7:

The other answers cover how to compute the standard deviation in Python sufficiently, but no one explains how to do the bizarre traversal you've described.

I'm going to assume A-Z is the entire population. If not, see Ome's answer on how to infer from a sample.

So to get the standard deviation/mean of the first element of every list you would need something like this:

    # standard deviation
    numpy.std([A_rank[0], B_rank[0], C_rank[0], ..., Z_rank[0]])

    # mean
    numpy.mean([A_rank[0], B_rank[0], C_rank[0], ..., Z_rank[0]])

To shorten the code and generalize this to any nth element, use the following function I generated for you:

    def getAllNthRanks(n):
        return [A_rank[n], B_rank[n], C_rank[n], D_rank[n], E_rank[n], F_rank[n],
                G_rank[n], H_rank[n], I_rank[n], J_rank[n], K_rank[n], L_rank[n],
                M_rank[n], N_rank[n], O_rank[n], P_rank[n], Q_rank[n], R_rank[n],
                S_rank[n], T_rank[n], U_rank[n], V_rank[n], W_rank[n], X_rank[n],
                Y_rank[n], Z_rank[n]]

Now you can simply get the std and mean of all the nth places from A-Z like this:

    # standard deviation
    numpy.std(getAllNthRanks(n))

    # mean
    numpy.mean(getAllNthRanks(n))
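Listing 26 separate variables by hand is easy to get wrong. One alternative, sketched here as an assumption rather than as part of the answer, is to keep the lists in a dictionary and pull out the nth element of each. The names ranks and nth_values are hypothetical, and only three of the lists are filled in. The standard-library statistics module stands in for NumPy so the snippet has no dependencies; pstdev matches numpy.std's default (population) behaviour.

```python
import statistics

# Hypothetical container: one entry per list, keyed by its letter.
ranks = {
    "A": [0.8, 0.4, 1.2, 3.7, 2.6, 5.8],
    "B": [0.1, 2.8, 3.7, 2.6, 5, 3.4],
    "C": [1.2, 3.4, 0.5, 0.1, 2.5, 6.1],
}


def nth_values(ranks, n):
    """Return the nth element of every list in ranks."""
    return [values[n] for values in ranks.values()]


first = nth_values(ranks, 0)
print(statistics.mean(first))    # mean of the first elements
print(statistics.pstdev(first))  # population standard deviation
```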

