Count occurrences of items in Series in each row of a DataFrame

一个人的身影 2020-12-06 06:02

I have a pandas.DataFrame that looks like this.

COL1    COL2    COL3
C1      None    None
C1      C2      None
C1      C1      None
C1      C2      C3


        
3 Answers
  •  渐次进展
    2020-12-06 06:14

    You could apply value_counts:

    In [11]: df.apply(pd.Series.value_counts, axis=1)
    Out[11]: 
       C1  C2  C3  None
    0   1 NaN NaN     2
    1   1   1 NaN     1
    2   2 NaN NaN     1
    3   1   1   1   NaN
    
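To see why this works, here is a minimal sketch that rebuilds the frame and runs the same call. The construction of `df` is an assumption: the question does not show how the frame was created, and the empty cells are taken to be the literal string 'None', since a `None` column appears in the output above. With `axis=1`, each row is passed to `value_counts` as a Series, and the per-row results are aligned by label, which is where the NaN entries come from.

```python
import pandas as pd

# Assumed reconstruction of the question's data; the missing cells are
# stored as the string 'None' so that they are counted like any other value.
df = pd.DataFrame({
    "COL1": ["C1", "C1", "C1", "C1"],
    "COL2": ["None", "C2", "C1", "C2"],
    "COL3": ["None", "None", "None", "C3"],
})

# Counting a single row: the result is a Series indexed by the distinct values.
row_counts = df.iloc[2].value_counts()

# Counting every row: the per-row Series become the rows of a new frame,
# and values absent from a row show up as NaN.
wide = df.apply(pd.Series.value_counts, axis=1)
print(wide)
```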

    So you can select just the columns you want and fill the NaN values with 0:

    In [12]: df.apply(pd.Series.value_counts, axis=1)[['C1', 'C2', 'C3']].fillna(0)
    Out[12]: 
       C1  C2  C3
    0   1   0   0
    1   1   1   0
    2   2   0   0
    3   1   1   1
    
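The same steps as a self-contained script, as a sketch under the assumption that the empty cells are the string 'None'. It swaps the `[['C1', 'C2', 'C3']]` selection for `reindex(columns=...)` and adds `astype(int)`; both are my additions, not part of the answer above.

```python
import pandas as pd

# Assumed reconstruction of the question's data (not shown in the original).
df = pd.DataFrame({
    "COL1": ["C1", "C1", "C1", "C1"],
    "COL2": ["None", "C2", "C1", "C2"],
    "COL3": ["None", "None", "None", "C3"],
})

wanted = ["C1", "C2", "C3"]  # the values whose occurrences should be counted

counts = (
    df.apply(pd.Series.value_counts, axis=1)  # one count Series per row
      .reindex(columns=wanted)                # keep only the wanted labels
      .fillna(0)                              # absent value means zero occurrences
      .astype(int)                            # counts are whole numbers
)
print(counts)
```

Unlike plain column selection, `reindex` does not raise a `KeyError` when one of the wanted values never occurs in the data; it produces an all-NaN column that `fillna(0)` then turns into zeros.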

    Note: there's an open issue to have a value_counts method directly for a DataFrame (which I think should be introduced by pandas 0.15).
