Pass percentiles to pandas agg function

Anonymous (unverified), submitted 2019-12-03 01:10:02

Question:

I want to pass the numpy percentile() function through pandas' agg() function as I do below with various other numpy statistics functions.

Right now I have a dataframe that looks like this:

AGGREGATE   MY_COLUMN
A           10
A           12
B           5
B           9
A           84
B           22

And my code looks like this:

grouped = dataframe.groupby('AGGREGATE')
column = grouped['MY_COLUMN']
column.agg([np.sum, np.mean, np.std, np.median, np.var, np.min, np.max])

The above code works, but I want to do something like

column.agg([np.sum, np.mean, np.percentile(50), np.percentile(95)]) 

i.e. specify various percentiles to return from agg()

How should this be done?

Answer 1:

Perhaps not super efficient, but one way would be to create a function yourself:

def percentile(n):
    def percentile_(x):
        return np.percentile(x, n)
    percentile_.__name__ = 'percentile_%s' % n
    return percentile_

Then include this in your agg:

In [11]: column.agg([np.sum, np.mean, np.std, np.median,
                     np.var, np.min, np.max, percentile(50), percentile(95)])
Out[11]:
           sum       mean        std  median          var  amin  amax  percentile_50  percentile_95
AGGREGATE
A          106  35.333333  42.158431      12  1777.333333    10    84             12           76.8
B           36  12.000000   8.888194       9    79.000000     5    22              9           20.7

Not sure this is how it should be done though...
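To try the factory approach end to end, here is a minimal, self-contained sketch. It assumes the six sample rows from the question. It also swaps np.sum and np.mean for the string names "sum" and "mean", which is a substitution not in the original answer (newer pandas versions may warn when NumPy reducers are passed to agg()).

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
dataframe = pd.DataFrame({
    "AGGREGATE": ["A", "A", "B", "B", "A", "B"],
    "MY_COLUMN": [10, 12, 5, 9, 84, 22],
})


def percentile(n):
    """Return a one-argument function that computes the n-th percentile."""
    def percentile_(x):
        return np.percentile(x, n)
    # agg() uses the function's __name__ as the output column label,
    # so each percentile needs a distinct name.
    percentile_.__name__ = "percentile_%s" % n
    return percentile_


column = dataframe.groupby("AGGREGATE")["MY_COLUMN"]
result = column.agg(["sum", "mean", percentile(50), percentile(95)])
print(result)
```

Setting `__name__` is what keeps the two percentile columns apart; without it both inner functions would share the label `percentile_`.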



Answer 2:

More specifically, if you only want to aggregate your pandas groupby results with a percentile function, a Python lambda offers a neat solution. Using the question's notation, aggregating by the 95th percentile would be:

dataframe.groupby('AGGREGATE')['MY_COLUMN'].agg(lambda x: np.percentile(x, q=95))

You can also assign this function to a variable and use it in conjunction with other aggregation functions.
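As a rough sketch of that idea, the lambdas can be stored in variables and passed alongside a built-in reducer. The names p50 and p95 and the sample data are illustrative assumptions, and the keyword form of agg() (named aggregation) assumes pandas 0.25 or later.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
dataframe = pd.DataFrame({
    "AGGREGATE": ["A", "A", "B", "B", "A", "B"],
    "MY_COLUMN": [10, 12, 5, 9, 84, 22],
})

# Hypothetical variable names; each lambda receives one group's values.
p50 = lambda x: np.percentile(x, q=50)
p95 = lambda x: np.percentile(x, q=95)

# Keyword arguments name the output columns, so the two lambdas
# do not collide on the default label "<lambda>".
result = dataframe.groupby("AGGREGATE")["MY_COLUMN"].agg(
    mean="mean", p50=p50, p95=p95
)
print(result)
```

Naming the outputs explicitly sidesteps the duplicate-label problem that arises when several anonymous functions are passed in a plain list.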



Answer 3:

Try this for the 50th and 95th percentiles:

column.describe(percentiles=[0.5, 0.95])
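describe() returns the full summary (count, mean, std, min, max and so on), so a small sketch of pulling out just the percentile columns may help. The sample data is assumed from the question, and the quantile() variant is an alternative that is not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
dataframe = pd.DataFrame({
    "AGGREGATE": ["A", "A", "B", "B", "A", "B"],
    "MY_COLUMN": [10, 12, 5, 9, 84, 22],
})
column = dataframe.groupby("AGGREGATE")["MY_COLUMN"]

# describe() labels percentile columns with strings such as "50%" and "95%".
summary = column.describe(percentiles=[0.5, 0.95])
print(summary[["50%", "95%"]])

# Alternative: quantile() with a list yields one row per (group, quantile)
# pair; unstack() pivots the quantile level into columns 0.5 and 0.95.
quantiles = column.quantile([0.5, 0.95]).unstack()
print(quantiles)
```

Both calls take fractions between 0 and 1, whereas np.percentile takes values between 0 and 100.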

