How to get number of groups in a groupby object in pandas?

Posted by 只谈情不闲聊 on 2019-11-29 09:03:20

As documented, you can get the number of groups with len(dfgroup).

As of v0.23, there are multiple options to choose from. First, the setup:

import pandas as pd

df = pd.DataFrame({'A': list('aabbcccd'), 'B': 'x'})
df

   A  B
0  a  x
1  a  x
2  b  x
3  b  x
4  c  x
5  c  x
6  c  x
7  d  x

g = df.groupby(['A'])

1) ngroups

Newer versions of the groupby API provide this (undocumented) attribute which stores the number of groups in a GroupBy object.

g.ngroups
# 4

Note that this is different from GroupBy.groups which actually returns the groups themselves:

g.groups
# {'a': Int64Index([0, 1], dtype='int64'),
#  'b': Int64Index([2, 3], dtype='int64'),
#  'c': Int64Index([4, 5, 6], dtype='int64'),
#  'd': Int64Index([7], dtype='int64')}  
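
To make the difference concrete, here is a minimal sketch that reuses the setup above and treats g.groups as a mapping from each group key to the row labels in that group. The names group_keys and rows_for_c are illustrative only and are not part of pandas.

```python
import pandas as pd

df = pd.DataFrame({'A': list('aabbcccd'), 'B': 'x'})
g = df.groupby('A')

# Iterating over the mapping yields the group keys.
group_keys = list(g.groups)

# Indexing by a key yields the row labels that belong to that group.
rows_for_c = list(g.groups['c'])

print(group_keys)
print(rows_for_c)
```

So ngroups answers "how many groups?", while groups answers "which rows are in which group?".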

2) len

As shown in BrenBarn's answer, you can either call len directly on the GroupBy object, or on the GroupBy.groups attribute (shown above).

len(g)
# 4

len(g.groups)
# 4

This has been documented in GroupBy object attributes.

3) Generator Expression

For completeness, you can also iterate over the groupby object, counting each group explicitly:

sum(1 for _ in g)
# 4
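
As a quick sanity check, the sketch below runs all of these approaches side by side on the same frame. Column A holds four distinct keys (a, b, c, d), so every approach should report the same number. The dictionary name counts is just an illustrative choice.

```python
import pandas as pd

df = pd.DataFrame({'A': list('aabbcccd'), 'B': 'x'})
g = df.groupby('A')

# Collect the result of each approach under a descriptive label.
counts = {
    'ngroups': g.ngroups,
    'len(g)': len(g),
    'len(g.groups)': len(g.groups),
    'generator': sum(1 for _ in g),
}

print(counts)
```

In practice g.ngroups or len(g) is the most direct choice; the generator expression walks through every group and is only shown for completeness.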

But what if I actually want the size of each group?

You're in luck. We have a function for that, GroupBy.size.

g.size()

A
a    2
b    2
c    3
d    1
dtype: int64

Note that size counts NaNs as well. If you don't want NaNs counted, use GroupBy.count instead.
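
The difference only shows up when the data contains missing values, so here is a small sketch under that assumption. The frame df_nan is made up for illustration: it is the same setup with a few NaNs placed in column B.

```python
import numpy as np
import pandas as pd

# Hypothetical variant of the setup with missing values in column B.
df_nan = pd.DataFrame({
    'A': list('aabbcccd'),
    'B': ['x', np.nan, 'x', 'x', np.nan, np.nan, 'x', 'x'],
})
g_nan = df_nan.groupby('A')

# size() counts every row in each group, NaN or not.
sizes = g_nan.size()

# count() counts only the non-null values of B in each group.
non_null = g_nan['B'].count()

print(sizes)
print(non_null)
```

Here size reports 2, 2, 3, 1 for groups a through d, while count reports 1, 2, 1, 1 because the NaN entries in B are skipped.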
