Vectorized way to count occurrences of string in either of two columns

后端未结

关注

 4  705

一整个雨季 2021-01-05 03:57

I have a problem that is similar to this question, but just different enough that it can\'t be solved with the same solution...

I\'ve got two dataframes,

4条回答

南方客 (楼主)

2021-01-05 04:21

By using get_dummies

pd.get_dummies(df1, prefix='', prefix_sep='').sum(level=0,axis=1).gt(0).sum().loc[df2.ID]
Out[614]: 
jack        3
jill        5
jane        8
joe         9
ben         7
beatrice    6
dtype: int64

I think this should be fast ...

from itertools import chain
from collections import Counter

pd.Series(Counter(list(chain(*list(map(set,df1.values)))))).loc[df2.ID]

0 讨论(0)

查看其它4个回答