Create dummies from column with multiple values in pandas

说谎 2020-12-04 10:35

I am looking for a pythonic way to handle the following problem.

The pandas.get_dummies() function is great for creating dummies from a categorical column...

4 answers
  •  再見小時候
    2020-12-04 11:32

    You can generate the dummies dataframe from your raw data, isolate the columns that contain a given atom (a label without '*'), and then store the row-wise sum of those columns in the atom's own column.

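    import pandas as pd

    df = pd.DataFrame({'label': ['A', 'B', 'C', 'D', 'A*C', 'C*D']})
    df
    Out[28]: 
      label
    0     A
    1     B
    2     C
    3     D
    4   A*C
    5   C*D
    
    # One dummy column per distinct label, including combined ones such as 'A*C'
    dummies = pd.get_dummies(df['label'])
    
    # Atoms are the labels that contain no '*'
    atom_col = [c for c in dummies.columns if '*' not in c]
    
    for col in atom_col:
        # Sum the dummy columns whose label has this atom as a '*'-separated part
        df[col] = dummies[[c for c in dummies.columns if col in c.split('*')]].sum(axis=1)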
    
    df
    Out[32]: 
      label  A  B  C  D
    0     A  1  0  0  0
    1     B  0  1  0  0
    2     C  0  0  1  0
    3     D  0  0  0  1
    4   A*C  1  0  1  0
    5   C*D  0  0  1  1
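
As a possible shortcut to the loop above, pandas also offers `Series.str.get_dummies`, which splits each string on a separator and builds the indicator columns in one call. The following is only a minimal sketch, assuming the labels are always joined with `*` as in the sample data; the names `atoms` and `result` are illustrative and not from the original answer.

```python
import pandas as pd

# Sample data copied from the answer above.
df = pd.DataFrame({"label": ["A", "B", "C", "D", "A*C", "C*D"]})

# Split each label on '*' and build one 0/1 indicator column per distinct part.
atoms = df["label"].str.get_dummies(sep="*")

# Attach the indicator columns next to the original labels.
result = df.join(atoms)
print(result)
```

Because the split is done on the separator rather than by substring matching, an atom such as `A` would not be confused with a longer atom such as `AB`.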
    
