I want to count the occurrences of a string in a column of a grouped pandas DataFrame.
Assume I have the following DataFrame:
catA catB scores
A
Call apply on the 'scores' column of the groupby object, use the vectorised str method contains to filter each group,
and then call count:
In [34]:
df.groupby(['catA', 'catB'])['scores'].apply(lambda x: x[x.str.contains('RET')].count())
Out[34]:
catA  catB
A     X       1
      Y       1
B     Z       2
Name: scores, dtype: int64
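For anyone who wants to reproduce this, here is a minimal self-contained sketch. The sample frame is an assumption, reconstructed from the outputs shown in this answer rather than taken from the asker's exact input. It also shows an equivalent form that sums the boolean mask instead of filtering and counting; the name counts_via_sum is just illustrative.

```python
import pandas as pd

# Assumed sample data, reconstructed from the outputs in this answer
df = pd.DataFrame({
    'catA': ['A', 'A', 'A', 'B', 'B'],
    'catB': ['X', 'X', 'Y', 'Z', 'Z'],
    'scores': ['6-4 RET', '6-4 6-4', '6-3 RET', '6-0 RET', '6-1 RET'],
})

# Filter each group down to the rows containing 'RET', then count them
counts = df.groupby(['catA', 'catB'])['scores'].apply(
    lambda x: x[x.str.contains('RET')].count()
)

# Equivalent: True counts as 1, so summing the mask gives the same numbers
counts_via_sum = df.groupby(['catA', 'catB'])['scores'].apply(
    lambda x: x.str.contains('RET').sum()
)

print(counts)
```

If 'scores' can contain missing values, passing na=False to str.contains treats them as non-matches instead of propagating NaN into the mask.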
To assign the result as a column, use transform so that the aggregation returns a Series whose index is aligned to the original df:
In [35]:
df['count'] = df.groupby(['catA', 'catB'])['scores'].transform(lambda x: x[x.str.contains('RET')].count())
df
Out[35]:
  catA catB   scores  count
0    A    X  6-4 RET      1
1    A    X  6-4 6-4      1
2    A    Y  6-3 RET      1
3    B    Z  6-0 RET      2
4    B    Z  6-1 RET      2
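A possible variation, not part of the approach above, is to avoid the lambda: compute the mask once for the whole column, then group the mask itself and broadcast the per-group sum back to every row. This is a sketch under the same assumed sample data (reconstructed from the output above); has_ret is a hypothetical helper name.

```python
import pandas as pd

# Assumed sample data, reconstructed from the output above
df = pd.DataFrame({
    'catA': ['A', 'A', 'A', 'B', 'B'],
    'catB': ['X', 'X', 'Y', 'Z', 'Z'],
    'scores': ['6-4 RET', '6-4 6-4', '6-3 RET', '6-0 RET', '6-1 RET'],
})

# Mask computed once for the whole column; cast to int so the sum is numeric
has_ret = df['scores'].str.contains('RET').astype(int)

# Group the mask by the key columns and broadcast each group's sum to its rows
df['count'] = has_ret.groupby([df['catA'], df['catB']]).transform('sum')

print(df)
```

On larger frames this tends to be faster than a Python-level lambda per group, since the string matching and the sum both run as vectorised operations, though it is worth timing on your own data.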