Count occurrences of items in Series in each row of a DataFrame

一个人的身影 2020-12-06 06:02

I have a pandas.DataFrame that looks like this.

COL1    COL2    COL3
C1      None    None
C1      C2      None
C1      C1      None
C1      C2      C3


        
3 Answers
  •  庸人自扰
    2020-12-06 06:33

    Applying a Series function across the whole DataFrame with apply usually slows the whole process down. Additional reading: Link

    # sum(level=0) was removed in pandas 2.0; groupby(level=0).sum() is the equivalent
    df.mask(df.eq('None')).stack().str.get_dummies().groupby(level=0).sum()
    Out[165]: 
       C1  C2  C3
    0   1   0   0
    1   1   1   0
    2   2   0   0
    3   1   1   1
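
For reference, the one-liner above can be sketched as a self-contained script. The frame below is an assumed reconstruction of the question's data: missing cells are taken to be the literal string 'None', and COL3 in the last row is taken to be C3, as the output above implies. The extra dropna() is a defensive addition, not part of the original answer.

```python
import pandas as pd

# Assumed reconstruction of the question's frame; 'None' is a string placeholder.
df = pd.DataFrame({
    "COL1": ["C1", "C1", "C1", "C1"],
    "COL2": ["None", "C2", "C1", "C2"],
    "COL3": ["None", "None", "None", "C3"],
})

counts = (
    df.mask(df.eq("None"))  # replace the 'None' placeholders with NaN
      .stack()              # long Series indexed by (row label, column name)
      .dropna()             # make sure no NaN entries survive the stack
      .str.get_dummies()    # one 0/1 indicator column per distinct value
      .groupby(level=0)     # regroup by the original row label
      .sum()                # add the indicators up into per-row counts
)
print(counts)
```

Each row of counts then holds how many times each value occurs in the matching row of df.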
    

    Or you can use Counter:

    from collections import Counter
    
    # Count the values in each row, then drop the column for the 'None' placeholder
    pd.DataFrame([Counter(x) for x in df.values]).drop(columns='None')
    Out[170]: 
       C1   C2   C3
    0   1  NaN  NaN
    1   1  1.0  NaN
    2   2  NaN  NaN
    3   1  1.0  1.0
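
If integer counts are preferred over NaN, a small variation on the Counter approach might look like the sketch below. It assumes the same reconstructed frame (string 'None' placeholders, C3 in the last row), and it skips the placeholder inside the comprehension instead of dropping a column afterwards. That is a change from the answer's code, not the answer's own method.

```python
from collections import Counter

import pandas as pd

# Assumed reconstruction of the question's frame; 'None' is a string placeholder.
df = pd.DataFrame({
    "COL1": ["C1", "C1", "C1", "C1"],
    "COL2": ["None", "C2", "C1", "C2"],
    "COL3": ["None", "None", "None", "C3"],
})

# One Counter per row, ignoring the 'None' placeholder up front.
row_counts = [Counter(v for v in row if v != "None") for row in df.to_numpy()]

# Values missing from a row come out as NaN; fill with 0 and cast back to int.
counts = pd.DataFrame(row_counts).fillna(0).astype(int)
print(counts)
```

Filtering first also avoids a KeyError from drop when no row happens to contain 'None'.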
    
