When doing groupby counts over multiple columns I get an error. Here is my dataframe and also an example that simply labels the distinct 'b' and 'c' groups.
Evaluate df.groupby(['b', 'c']).count()
in an interactive session:
In [150]: df.groupby(['b', 'c']).count()
Out[150]:
     a  b  c  d
b c
0 0  1  1  1  1
  1  1  1  1  1
1 1  2  2  2  2
This is a whole DataFrame. It is probably not what you want to assign to a new column of df
(in fact, you cannot assign a DataFrame to a single column, which is why an exception, albeit a cryptic one, is raised).
If you wish to create a new column which counts the number of rows in each group, you could use
df['gr'] = df.groupby(['b', 'c'])['a'].transform('count')
For example,
import pandas as pd
import numpy as np
np.random.seed(1)
df = pd.DataFrame(np.random.randint(0, 2, (4, 4)),
                  columns=['a', 'b', 'c', 'd'])
print(df)
#    a  b  c  d
# 0  1  1  0  0
# 1  1  1  1  1
# 2  1  0  0  1
# 3  0  1  1  0
df['gr'] = df.groupby(['b', 'c'])['a'].transform('count')
df['comp_ids'] = df.groupby(['b', 'c']).grouper.group_info[0]
print(df)
yields
   a  b  c  d  gr  comp_ids
0  1  1  0  0   1         1
1  1  1  1  1   2         2
2  1  0  0  1   1         0
3  0  1  1  0   2         2
Notice that df.groupby(['b', 'c']).grouper.group_info[0]
does not return the number of rows in each group. Rather, it returns an integer label for each group, so rows in the same ('b', 'c') group share the same label.
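As a side note, grouper.group_info is an internal attribute, and I believe recent pandas 2.x releases deprecate access to it. Here is a minimal sketch of the same two columns using only public methods, assuming a pandas version that provides GroupBy.ngroup(). The frame is written out literally with the same values as the example above so it does not depend on the random seed. It also uses transform('size') instead of transform('count'): 'size' counts rows, while 'count' counts non-null values of the selected column, so the two only differ when 'a' contains NaN.

```python
import pandas as pd

# Same values as the example frame above, written out literally.
df = pd.DataFrame(
    [[1, 1, 0, 0],
     [1, 1, 1, 1],
     [1, 0, 0, 1],
     [0, 1, 1, 0]],
    columns=['a', 'b', 'c', 'd'],
)

# Number of rows in each ('b', 'c') group, broadcast back to every row.
# 'size' counts rows; 'count' would count non-null values of 'a'.
df['gr'] = df.groupby(['b', 'c'])['a'].transform('size')

# Integer label for each ('b', 'c') group, numbered in sorted key order.
df['comp_ids'] = df.groupby(['b', 'c']).ngroup()

print(df)
```

On this frame both columns should match the gr and comp_ids values shown in the output above.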