How to get number of groups in a groupby object in pandas?

Submitted by 懵懂的女人 on 2019-11-27 22:55:22

Question


This would be useful so I know how many unique groups I have to perform calculations on. Thank you.

Suppose the groupby object is called dfgroup.


Answer 1:


As documented, you can get the number of groups with len(dfgroup).
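A minimal sketch of this, assuming a small made-up DataFrame (the column names `key` and `val` are illustrative and not from the question):

```python
import pandas as pd

# Hypothetical data: 'key' holds three distinct values (x, y, z)
df = pd.DataFrame({'key': ['x', 'y', 'x', 'z'], 'val': [1, 2, 3, 4]})
dfgroup = df.groupby('key')

# len() on the GroupBy object gives the number of groups
n_groups = len(dfgroup)
print(n_groups)  # 3
```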




Answer 2:


As of v0.23, there are multiple options to choose from. First, the setup:

import pandas as pd

df = pd.DataFrame({'A': list('aabbcccd'), 'B': 'x'})
df

   A  B
0  a  x
1  a  x
2  b  x
3  b  x
4  c  x
5  c  x
6  c  x
7  d  x

g = df.groupby(['A'])

1) ngroups

Newer versions of the groupby API provide this (undocumented) attribute which stores the number of groups in a GroupBy object.

g.ngroups
# 4

Note that this is different from GroupBy.groups which actually returns the groups themselves:

g.groups
# {'a': Int64Index([0, 1], dtype='int64'),
#  'b': Int64Index([2, 3], dtype='int64'),
#  'c': Int64Index([4, 5, 6], dtype='int64'),
#  'd': Int64Index([7], dtype='int64')}  

2) len

As shown in BrenBarn's answer, you can either call len directly on the GroupBy object, or on the GroupBy.groups attribute (shown above).

len(g)
# 4

len(g.groups)
# 4

This has been documented in GroupBy object attributes.

3) Generator Expression

For completeness, you can also iterate over the groupby object, counting each group explicitly:

sum(1 for _ in g)
# 4
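As a quick sanity check, the options listed in this answer can be run side by side on the setup frame. This is only a sketch: the `counts` dictionary is an illustrative name, and the frame is grouped by the plain string 'A' rather than the list ['A'], which yields the same groups. Column A holds the four distinct values a, b, c, d, so every approach should report 4.

```python
import pandas as pd

df = pd.DataFrame({'A': list('aabbcccd'), 'B': 'x'})
g = df.groupby('A')

# Collect the group count from each approach for comparison
counts = {
    'ngroups': g.ngroups,
    'len(g)': len(g),
    'len(g.groups)': len(g.groups),
    'generator': sum(1 for _ in g),
}
print(counts)
```

The generator expression has to walk every group, so in practice ngroups or len(g) is the more direct choice.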

But what if I actually want the size of each group?

You're in luck. We have a function for that, GroupBy.size.

g.size()

A
a    2
b    2
c    3
d    1
dtype: int64

Note that size counts NaNs as well. If you don't want NaNs counted, use GroupBy.count instead.
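To make the difference concrete, here is a small sketch with a made-up frame containing NaNs (the names `df_nan`, `g_nan`, `sizes`, and `non_null` are illustrative and not part of the original answer):

```python
import numpy as np
import pandas as pd

# Hypothetical data: group 'a' has one missing value in B, group 'b' has only missing values
df_nan = pd.DataFrame({'A': ['a', 'a', 'b', 'b'],
                       'B': [1.0, np.nan, np.nan, np.nan]})
g_nan = df_nan.groupby('A')

sizes = g_nan.size()           # rows per group, NaNs included
non_null = g_nan['B'].count()  # non-NaN values of B per group

print(sizes.to_dict())     # {'a': 2, 'b': 2}
print(non_null.to_dict())  # {'a': 1, 'b': 0}
```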



Source: https://stackoverflow.com/questions/27787930/how-to-get-number-of-groups-in-a-groupby-object-in-pandas
