Vectorized way to count occurrences of string in either of two columns

一整个雨季 2021-01-05 03:57

I have a problem that is similar to this question, but just different enough that it can't be solved with the same solution...

I've got two dataframes,

4 Answers
  •  南方客 (original poster)
    2021-01-05 04:21

    By using get_dummies

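    # sum(level=...) was removed in pandas 2.0; group the transposed dummies by label instead
    pd.get_dummies(df1, prefix='', prefix_sep='', dtype=int).T.groupby(level=0).sum().gt(0).sum(axis=1).loc[df2.ID]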
    Out[614]: 
    jack        3
    jill        5
    jane        8
    joe         9
    ben         7
    beatrice    6
    dtype: int64
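
The question's frames are not shown in full here, so the following is only a minimal sketch of the get_dummies idea on made-up data. The column names `A` and `B`, the contents of `df1` and `df2`, and the helper name `count_rows_with_name` are all assumptions for illustration. It counts, for each ID, the number of rows of `df1` in which that ID appears in either column.

```python
import pandas as pd

# Hypothetical stand-ins for the question's frames (the real data is not shown).
df1 = pd.DataFrame({"A": ["jack", "jill", "jack", "joe"],
                    "B": ["jill", "jill", "joe", "ben"]})
df2 = pd.DataFrame({"ID": ["jack", "jill", "joe", "ben"]})

def count_rows_with_name(frame, ids):
    """Count the rows of `frame` in which each id occurs in any column."""
    # One indicator column per (source column, name); the empty prefix
    # means the same name coming from A and from B shares one label.
    dummies = pd.get_dummies(frame, prefix="", prefix_sep="", dtype=int)
    # Transpose so names are on the index, then merge duplicate labels.
    per_name = dummies.T.groupby(level=0).sum()
    # gt(0) turns "appeared once or twice in this row" into a single hit.
    return per_name.gt(0).sum(axis=1).loc[ids]

counts = count_rows_with_name(df1, df2["ID"])
print(counts)
```

The `gt(0)` step is what keeps a row such as (jill, jill) from being counted twice.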
    

    I think this should be fast ...

    from itertools import chain
    from collections import Counter
    
    # set() deduplicates within each row, so a name in both columns counts once
    pd.Series(Counter(chain.from_iterable(map(set, df1.values)))).loc[df2.ID]
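
For reference, here is a self-contained sketch of the Counter approach on made-up data. The frames and the column names `A` and `B` are assumptions, not the question's actual data. It also swaps `.loc` for `reindex(..., fill_value=0)`, which is a deliberate deviation from the answer: `.loc` raises a `KeyError` for an ID that never appears in `df1`, whereas `reindex` reports it as 0.

```python
from collections import Counter
from itertools import chain

import pandas as pd

# Hypothetical data; "beatrice" is in df2 but never appears in df1.
df1 = pd.DataFrame({"A": ["jack", "jill", "jack", "joe"],
                    "B": ["jill", "jill", "joe", "ben"]})
df2 = pd.DataFrame({"ID": ["jack", "jill", "joe", "ben", "beatrice"]})

# Each row becomes a set, so a name present in both columns is one hit.
row_sets = map(set, df1.to_numpy())

# Counter tallies how many row sets contain each name.
tally = Counter(chain.from_iterable(row_sets))

# Align to the requested IDs; names absent from df1 get a count of 0.
counts = pd.Series(tally).reindex(df2["ID"], fill_value=0)
print(counts)
```

This still loops over rows at Python level, so whether it beats the get_dummies version will depend on the size of the frames and the number of distinct names; timing both on the real data is the only reliable check.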
    
