I have data of the following form:
import pandas as pd
import numpy as np

df = pd.DataFrame({
    'group': [1, 1, 2, 3, 3, 3, 4],
    'param': ['a', 'a', 'b', np.nan, 'a', 'a', np.nan]})
This is just an add-on to the other solutions, in case you want to compute other aggregate functions in addition to the number of unique values:

df.groupby(['group']).agg(['min', 'max', 'count', 'nunique'])

Hope you find it useful.
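As a sketch of what this kind of aggregation produces on the sample data from the question (restricting to the param column to keep the result compact; count and nunique both ignore NaN):

```python
import pandas as pd
import numpy as np

# Sample frame from the question
df = pd.DataFrame({
    'group': [1, 1, 2, 3, 3, 3, 4],
    'param': ['a', 'a', 'b', np.nan, 'a', 'a', np.nan]})

# Aggregate only the 'param' column; 'count' ignores NaN,
# so group 4 (whose only param is NaN) reports count 0, nunique 0
res = df.groupby('group')['param'].agg(['count', 'nunique'])
print(res)
```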
I think you can use SeriesGroupBy.nunique:
print (df.groupby('param')['group'].nunique())
param
a 2
b 1
Name: group, dtype: int64
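If you instead want the opposite direction — the number of unique param values within each group — the same idea works with the columns swapped (a sketch using the sample frame from the question):

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({
    'group': [1, 1, 2, 3, 3, 3, 4],
    'param': ['a', 'a', 'b', np.nan, 'a', 'a', np.nan]})

# nunique skips NaN, so group 4 reports 0 unique params
out = df.groupby('group')['param'].nunique()
print(out)
```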
Another solution: get the unique values per group with unique, create a new DataFrame from them with DataFrame.from_records, reshape it to a Series with stack, and finally count with value_counts:
a = df[df.param.notnull()].groupby('group')['param'].unique()
print (pd.DataFrame.from_records(a.values.tolist()).stack().value_counts())
a 2
b 1
dtype: int64
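An equivalent result can be reached without from_records by first dropping NaN rows and duplicate (group, param) pairs, then counting — a sketch of that alternative:

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({
    'group': [1, 1, 2, 3, 3, 3, 4],
    'param': ['a', 'a', 'b', np.nan, 'a', 'a', np.nan]})

# Keep one row per (group, param) pair, then count how many
# groups each param value appears in
counts = (df.dropna(subset=['param'])
            .drop_duplicates(['group', 'param'])['param']
            .value_counts())
print(counts)
```

Because duplicates within a group are removed first, each group contributes at most once per param value, matching the stack/value_counts result above.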
I know it has been a while since this was posted, but I think this will help too. I wanted to count the unique values and then filter the groups by the number of those unique values; this is how I did the counting:
df.groupby('group').agg(['min','max','count','nunique']).reset_index(drop=False)
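To actually filter groups by the unique-value count, GroupBy.filter can be combined with nunique — a sketch, keeping only groups with at least one distinct non-NaN param (the threshold is an assumption for illustration):

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({
    'group': [1, 1, 2, 3, 3, 3, 4],
    'param': ['a', 'a', 'b', np.nan, 'a', 'a', np.nan]})

# Keep only groups whose 'param' column has at least one
# distinct non-NaN value; group 4 (all NaN) is dropped
filtered = df.groupby('group').filter(lambda g: g['param'].nunique() >= 1)
print(filtered)
```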