I have a problem that is similar to this question, but just different enough that it can't be solved with the same solution...
I've got two dataframes,
By using get_dummies
pd.get_dummies(df1, prefix='', prefix_sep='').T.groupby(level=0).sum().gt(0).sum(axis=1).loc[df2.ID]
Out[614]:
jack 3
jill 5
jane 8
joe 9
ben 7
beatrice 6
dtype: int64
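To make the steps of the one-liner easier to follow, here is a minimal sketch on made-up data. The small `df1` and `df2` below are hypothetical stand-ins (the real frames are not shown here), and `dtype=int` is only added so the indicator columns are plain integers regardless of the pandas version.

```python
import pandas as pd

# Hypothetical stand-ins for df1 and df2; the real frames are not shown.
df1 = pd.DataFrame({
    "a": ["jack", "jill", "jack"],
    "b": ["jill", "jill", "joe"],
})
df2 = pd.DataFrame({"ID": ["jack", "jill", "joe"]})

# One indicator column per (column, value) pair. Empty prefixes leave the
# bare value as the column label, so a name can label several columns.
dummies = pd.get_dummies(df1, prefix="", prefix_sep="", dtype=int)

# Transpose so the duplicated labels sit on the index, merge them per name,
# and flag whether each name occurs at least once in each original row.
present = dummies.T.groupby(level=0).sum().gt(0)

# Count the rows containing each name, reported in the order of df2.ID.
counts = present.sum(axis=1).loc[df2.ID]
print(counts)  # jack 2, jill 2, joe 1
```

Here `jill` appears twice in the second row but is counted only once for it, which is what the `.gt(0)` step is for.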
I think this should be fast ...
from itertools import chain
from collections import Counter
import pandas as pd
# set() de-duplicates each row, so a name is counted at most once per row
pd.Series(Counter(chain.from_iterable(map(set, df1.values)))).loc[df2.ID]
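As a quick sanity check of the `Counter` idea, here is a small sketch on made-up data. The `df1` and `df2` below are hypothetical (the original frames are not shown), and the intermediate name `row_sets` is only there for readability.

```python
from collections import Counter
from itertools import chain

import pandas as pd

# Hypothetical stand-ins for df1 and df2; the real frames are not shown.
df1 = pd.DataFrame({
    "a": ["jack", "jill", "jack"],
    "b": ["jill", "jill", "joe"],
})
df2 = pd.DataFrame({"ID": ["jack", "jill", "joe"]})

# Turn each row into a set so repeated names within a row collapse to one.
row_sets = map(set, df1.values)

# Flatten the sets into one stream of names and tally them; each name's
# tally is then the number of rows it appears in.
counts = pd.Series(Counter(chain.from_iterable(row_sets))).loc[df2.ID]
print(counts)  # jack 2, jill 2, joe 1
```

Because this stays in plain Python containers and never builds a wide indicator frame, it may scale better when there are many distinct names, though that is worth timing on the real data.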