How to convert a column of strings to numerical values?

暖寄归人 2021-01-21 06:53

I have this pandas dataframe from a query:

|    name    |    event    |
----------------------------
| name_1     | event_1     |
| name_1     | event_2     |
|          


        
3 answers
  •  悲哀的现实
    2021-01-21 07:22

    You are asking for a Pythonic way. I think the way to do this in Python is a technique called one-hot encoding, which is well implemented in libraries such as sklearn. After one-hot encoding, you need to group your dataframe by the first column and apply the sum function.
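The idea described above (one indicator column per event value, then a grouped sum) can be sketched with pandas alone. This is an illustrative alternative, not the answerer's method: `pd.get_dummies` is swapped in for the sklearn encoder, and the three-row frame is assumed sample data matching the rows visible in the question.

```python
import pandas as pd

# Assumed sample data, mirroring the rows shown in the question.
dataset = pd.DataFrame(
    [['name_1', 'event_1'], ['name_1', 'event_2'], ['name_2', 'event_1']],
    columns=['name', 'event'],
)

# One-hot encode: one 0/1 indicator column per distinct event value.
one_hot = pd.get_dummies(dataset['event'], dtype=int)

# Group the indicator rows by name and sum them to count events per name.
counts = one_hot.groupby(dataset['name']).sum().reset_index()

print(counts)
```

For this sample, `counts` has one row per name, with columns `name`, `event_1`, and `event_2` holding how many times each event occurred. Because `get_dummies` creates a column for every distinct value, the same sketch should also cover more than two event types.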

    Here is the code, using sklearn's LabelBinarizer:

    import pandas as pd  # the useful libraries
    import numpy as np
    from sklearn.preprocessing import LabelBinarizer  # from sklearn
    # just reproduce your dataframe
    dataset = pd.DataFrame([['name_1', 'event_1'], ['name_1', 'event_2'], ['name_2', 'event_1']], columns=['name', 'event'], index=[1, 2, 3])
    data = dataset['event']
    enc = LabelBinarizer(neg_label=0)
    # with two classes, fit_transform returns an (n, 1) array: 1 for 'event_2', 0 for 'event_1'
    dataset['event_2'] = enc.fit_transform(data).ravel()
    event_two = dataset['event_2']
    # np.bool was removed from recent NumPy releases, so use the builtin bool
    dataset['event_1'] = (~event_two.astype(bool)).astype(np.int64)  # event_1 is the complement of event_2
    # sum only the indicator columns so the original string column is not concatenated
    dataset = dataset.groupby('name')[['event_1', 'event_2']].sum()
    dataset.reset_index(inplace=True)
    

    And the output is:
