Calculate Arbitrary Percentile on Pandas GroupBy

Submitted by 淺唱寂寞╮ on 2019-12-17 23:42:17

Question


Currently there is a median method on Pandas GroupBy objects.

Is there a way to calculate an arbitrary percentile (see: http://docs.scipy.org/doc/numpy-dev/reference/generated/numpy.percentile.html) on the groupings?

The median would be the percentile calculation with q=50.


Answer 1:


You want the quantile method:

In [47]: df
Out[47]: 
           A         B    C
0   0.719391  0.091693  one
1   0.951499  0.837160  one
2   0.975212  0.224855  one
3   0.807620  0.031284  one
4   0.633190  0.342889  one
5   0.075102  0.899291  one
6   0.502843  0.773424  one
7   0.032285  0.242476  one
8   0.794938  0.607745  one
9   0.620387  0.574222  one
10  0.446639  0.549749  two
11  0.664324  0.134041  two
12  0.622217  0.505057  two
13  0.670338  0.990870  two
14  0.281431  0.016245  two
15  0.675756  0.185967  two
16  0.145147  0.045686  two
17  0.404413  0.191482  two
18  0.949130  0.943509  two
19  0.164642  0.157013  two

In [48]: df.groupby('C').quantile(.95)
Out[48]: 
            A         B
C                      
one  0.964541  0.871332
two  0.826112  0.969558
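As a minimal, self-contained sketch of the same idea: the toy frame below is hypothetical (it is not the data shown above), and it only illustrates that quantile takes a fraction between 0 and 1, whereas numpy.percentile takes a value between 0 and 100.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, chosen so the results are easy to check by hand.
df = pd.DataFrame({
    "A": [1.0, 2.0, 3.0, 4.0, 10.0, 20.0, 30.0, 40.0],
    "C": ["one"] * 4 + ["two"] * 4,
})

grouped = df.groupby("C")["A"]

# 95th percentile per group; q is a fraction in [0, 1].
q95 = grouped.quantile(0.95)

# The median is the special case q=0.5.
q50 = grouped.quantile(0.5)
med = grouped.median()

print(q95)
print(np.allclose(q50, med))
```

With the default linear interpolation, the 0.95 quantile of a four-element group falls between its two largest values.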



Answer 2:


I found another useful solution here

If you need to compute several percentiles in one groupby call, another approach is a small factory that builds named aggregation functions:

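import numpy as np

def percentile(n):
    # Build an aggregation function that returns the n-th percentile (n in 0-100).
    def percentile_(x):
        return np.percentile(x, n)
    # Set __name__ so the result column is labelled, e.g. percentile_95.
    percentile_.__name__ = 'percentile_%s' % n
    return percentile_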

With the call below, I get the same result as the solution given by @TomAugspurger:

df.groupby('C').agg([percentile(50), percentile(95)])
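To make the shape of that result concrete, here is a runnable sketch on a hypothetical toy frame (again, not the data from the first answer). It repeats the factory so the block stands alone, and it assumes the data contains no NaN values: np.percentile propagates NaN, while the quantile method skips it, so the two approaches can differ on data with missing values.

```python
import numpy as np
import pandas as pd

def percentile(n):
    # Factory from the answer above: returns a named aggregation function.
    def percentile_(x):
        return np.percentile(x, n)
    percentile_.__name__ = 'percentile_%s' % n
    return percentile_

# Hypothetical toy data with one numeric column and one grouping column.
df = pd.DataFrame({
    "A": [1.0, 2.0, 3.0, 4.0, 10.0, 20.0, 30.0, 40.0],
    "C": ["one"] * 4 + ["two"] * 4,
})

# Passing a list of functions yields one result column per function,
# nested under each original column name.
result = df.groupby("C").agg([percentile(50), percentile(95)])
print(result)

# Cross-check against the built-in quantile method.
expected_95 = df.groupby("C")["A"].quantile(0.95)
print(np.allclose(result[("A", "percentile_95")], expected_95))
```

The columns of the result form a two-level index, so a single percentile is selected with a tuple such as ("A", "percentile_95").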



Source: https://stackoverflow.com/questions/19894939/calculate-arbitrary-percentile-on-pandas-groupby
