Custom sort order function for groupby pandas python

大憨熊 submitted on 2020-01-03 05:22:29

Question


Let's say I have a grouped dataframe like the one below (obtained through an initial df.groupby(df["A"]).apply(some_func), where some_func itself returns a dataframe). The second column is the second level of the MultiIndex created by the groupby.

A   B C
1 0 1 8
  1 3 3
2 0 1 2
  1 2 2
3 0 1 3
  1 2 4

I would like to order the groups by the result of a custom function applied to each group.

Let's assume for this example that the function is

def my_func(group):
    return sum(group["B"]*group["C"])

I would then like the result of the sort operation to return

A   B C
2 0 1 2
  1 2 2
3 0 1 3
  1 2 4
1 0 1 8
  1 3 3
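
To make the expected order concrete, here is a minimal sketch that rebuilds the example frame and computes one score per group. The index level names "A" and "i" and the variable names scores and group_order are assumptions for illustration; the question does not name the second level.

```python
import pandas as pd

# Rebuild the example frame; the level names "A" and "i" are assumed.
index = pd.MultiIndex.from_product([[1, 2, 3], [0, 1]], names=["A", "i"])
df = pd.DataFrame({"B": [1, 3, 1, 2, 1, 2],
                   "C": [8, 3, 2, 2, 3, 4]}, index=index)


def my_func(group):
    return sum(group["B"] * group["C"])


# One score per group, converted to plain ints for readable output.
scores = {int(a): int(my_func(g)) for a, g in df.groupby(level="A")}
print(scores)       # {1: 17, 2: 6, 3: 11}

# Group labels in ascending order of their score.
group_order = sorted(scores, key=scores.get)
print(group_order)  # [2, 3, 1]
```

Group 1 scores 1*8 + 3*3 = 17, group 2 scores 1*2 + 2*2 = 6, and group 3 scores 1*3 + 2*4 = 11, which gives the order 2, 3, 1 shown above.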

Answer 1:


This is based on @Wen-Ben's excellent answer, but uses sort_values to maintain the intra/inter group orders.

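# Assumes df is indexed by two levels named 'A' and 'i'.
groups = df.groupby('A')

# Compute one score per group and broadcast it back onto the rows.
df['func'] = (groups.apply(my_func)
              .reindex(df.index.get_level_values(0))
              .values)

# Sort by score first, then by the index levels to keep the original order.
(df.reset_index()
 .sort_values(['func','A','i'])
 .drop('func', axis=1)
 .set_index(['A','i']))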
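
As a variation on the snippet above, the following sketch avoids mutating df and skips the reindex step. It swaps in groupby(...).transform("sum"), which only works because this particular my_func is a per-group sum of B*C; for an arbitrary function the apply/reindex approach is still needed. The level names "A" and "i" are again assumed.

```python
import pandas as pd

# Example frame; the index level names "A" and "i" are assumed.
index = pd.MultiIndex.from_product([[1, 2, 3], [0, 1]], names=["A", "i"])
df = pd.DataFrame({"B": [1, 3, 1, 2, 1, 2],
                   "C": [8, 3, 2, 2, 3, 4]}, index=index)

# Per-row copy of each group's score: sum of B*C within each value of A.
score = (df["B"] * df["C"]).groupby(level="A").transform("sum")

# sort_values accepts index level names alongside column names.
result = (df.assign(func=score)
            .sort_values(["func", "A", "i"])
            .drop(columns="func"))

order = [(int(a), int(i)) for a, i in result.index]
print(order)  # [(2, 0), (2, 1), (3, 0), (3, 1), (1, 0), (1, 1)]
```

Because the temporary func column is added with assign, the original df is left unchanged.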

Note: quicksort, the default algorithm for idx.argsort(), is not stable, so rows with equal keys may come back in any order. That is why @Wen-Ben's answer can lose the original row order on more complicated datasets. You can use idx.argsort(kind='mergesort') for a stable sort, i.e., one that keeps the original order of tied values.
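
A minimal sketch of the stable variant described in the note, assuming the same example frame with index levels named "A" and "i" (the names positions and result are illustrative):

```python
import pandas as pd

# Example frame; the index level names "A" and "i" are assumed.
index = pd.MultiIndex.from_product([[1, 2, 3], [0, 1]], names=["A", "i"])
df = pd.DataFrame({"B": [1, 3, 1, 2, 1, 2],
                   "C": [8, 3, 2, 2, 3, 4]}, index=index)


def my_func(group):
    return sum(group["B"] * group["C"])


# One score per group, repeated for every row of that group.
scores = df.groupby(level="A").apply(my_func)
idx = scores.reindex(df.index.get_level_values("A"))

# kind="mergesort" requests a stable sort, so rows with the same score
# keep their original relative order.
positions = idx.argsort(kind="mergesort").to_numpy()
result = df.iloc[positions]
print(positions.tolist())  # [2, 3, 4, 5, 0, 1]
```

Every row in a group shares the same score, so ties are the normal case here rather than an edge case, which is why stability matters.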




Answer 2:


IIUC, apply your function to the groups, reindex the result onto the first index level, then order the rows with argsort:

idx=df.groupby('A').apply(my_func).reindex(df.index.get_level_values(0))
df.iloc[idx.argsort()]
Out[268]: 
     B  C
A       
2 0  1  2
  1  2  2
3 0  1  3
  1  2  4
1 0  1  8
  1  3  3


Source: https://stackoverflow.com/questions/56033073/custom-sort-order-function-for-groupby-pandas-python
