Pandas dataframe groupby to calculate population standard deviation


I am trying to use groupby and np.std to calculate a population standard deviation, but the result appears to be the sample standard deviation (delta degrees of freedom, ddof, equal to 1).

2 Answers

爱一瞬间的悲伤

2021-02-20 15:26

For delta degrees of freedom = 0 (ddof=0, the population standard deviation):

(This means that groups containing a single value will end up with std=0 instead of NaN.)
```
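import numpy as np


def std(x):
    # Population standard deviation; ddof=0 is np.std's default, spelled out here.
    return np.std(x, ddof=0)


# df is the asker's DataFrame, grouped by column 'A'.
df.groupby('A').agg(['mean', 'max', std])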
```
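To illustrate the remark about single-value groups, here is a minimal sketch. The DataFrame is hypothetical (the question does not show its data), and the lambda converts each group to a NumPy array first, so that `np.std` is applied directly rather than dispatched to a pandas method.

```python
import numpy as np
import pandas as pd

# Hypothetical data: group 1 has two rows, group 2 has only one.
df = pd.DataFrame({"A": [1, 1, 2], "values": [0.0, 5.0, 3.0]})

# Population standard deviation (ddof=0): the single-row group yields 0.0.
pop_std = df.groupby("A")["values"].agg(lambda s: np.std(s.to_numpy(), ddof=0))

# Sample standard deviation (ddof=1): the single-row group yields NaN.
sample_std = df.groupby("A")["values"].std(ddof=1)
```

With ddof=1 the divisor is n - 1, which is zero for a one-row group, hence the NaN.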

余生分开走

2021-02-20 15:40

You can pass additional keyword arguments, such as ddof, through agg to np.std:

In [202]:

df.groupby('A').agg(np.std, ddof=0)

Out[202]:
     B  values
A             
1  0.5     2.5
2  0.5     2.5

In [203]:

df.groupby('A').agg(np.std, ddof=1)

Out[203]:
          B    values
A                    
1  0.707107  3.535534
2  0.707107  3.535534
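The session above does not show how `df` was built. As a sketch, the hypothetical frame below was chosen so that its per-group results match the printed values. It uses the `ddof` parameter of the groupby `std` method, a different route from the `agg(np.std, ddof=...)` call in the answer.

```python
import pandas as pd

# Hypothetical frame (an assumption): two rows per group, with B differing
# by 1 and values differing by 5 within each group.
df = pd.DataFrame({
    "A": [1, 1, 2, 2],
    "B": [1, 2, 1, 2],
    "values": [10, 15, 20, 25],
})

# Population standard deviation: divide by n.
population = df.groupby("A").std(ddof=0)

# Sample standard deviation: divide by n - 1 (the pandas default).
sample = df.groupby("A").std(ddof=1)
```

For two values a and b, the population standard deviation is |a - b| / 2 and the sample standard deviation is |a - b| / sqrt(2), which is where 0.5 versus 0.707107 and 2.5 versus 3.535534 come from.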
