One-hot-encoding multiple columns in sklearn and naming columns

猫巷女王i 2021-01-07 00:29

I have the following code to one-hot-encode two of my columns.

# encode city labels using one-hot encoding scheme
city_ohe = OneHotEncoder(categories='auto')
        
2 Answers
  • 2021-01-07 01:07

    Why don't you take a look at pd.get_dummies? Here's how you can encode:

    df['city'] = df['city'].astype('category')
    df['phone'] = df['phone'].astype('category')
    df = pd.get_dummies(df)
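
    As a minimal sketch of the same idea, pd.get_dummies can also be told which columns to encode through its columns parameter, and it then names each new column as <column>_<category>. The sample values and the extra age column below are made up for illustration; only the city and phone column names come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names "city" and "phone"
# come from the thread.
df = pd.DataFrame({
    "city": ["Paris", "Lyon", "Paris"],
    "phone": ["android", "ios", "ios"],
    "age": [31, 25, 40],
})

# Encode only the listed columns; every other column passes through unchanged.
encoded = pd.get_dummies(df, columns=["city", "phone"])

print(encoded.columns.tolist())
```

    Passing columns explicitly avoids the astype('category') step and keeps numeric columns such as age untouched.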
    
  • 2021-01-07 01:31

    You are almost there. As you said, you can pass all the columns you want to encode to fit_transform directly.

    import numpy as np
    import pandas as pd
    from sklearn.preprocessing import OneHotEncoder

    ohe = OneHotEncoder(categories='auto')
    feature_arr = ohe.fit_transform(df[['phone','city']]).toarray()
    feature_labels = ohe.categories_
    

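    Then flatten the per-column category arrays into a single array of labels:

    # categories_ holds one array per encoded column; join them end to end
    feature_labels = np.concatenate(feature_labels)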
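
    One caveat worth checking here: ohe.categories_ is a list with one array per encoded column, and those arrays can have different lengths. Wrapping them in np.array only gives a flat result after ravel() when every column happens to have the same number of categories. The sketch below uses np.concatenate instead, with a hand-written stand-in for categories_ (the values are hypothetical).

```python
import numpy as np

# Stand-in for ohe.categories_: one array per encoded column.
# The values are hypothetical; note the two arrays differ in length.
categories = [
    np.array(["android", "ios"], dtype=object),
    np.array(["Lyon", "Nice", "Paris"], dtype=object),
]

# Join the per-column arrays end to end into one flat array of labels.
flat_labels = np.concatenate(categories)

print(flat_labels.tolist())
```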
    

    This lets you name the columns the way you wanted:

    features = pd.DataFrame(feature_arr, columns=feature_labels)
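
    Putting the steps together, a self-contained sketch might look like the following. The sample data is invented, and prefixing each label with its source column is an addition of this sketch (not part of the answer above) so that names stay unique if two columns share a category value. Recent scikit-learn releases also provide get_feature_names_out on the fitted encoder, which could replace the manual loop if your version has it.

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# Hypothetical sample data; only the column names come from the thread.
df = pd.DataFrame({
    "phone": ["android", "ios", "ios"],
    "city": ["Paris", "Lyon", "Nice"],
})

cols = ["phone", "city"]
ohe = OneHotEncoder(categories="auto")
# fit_transform returns a sparse matrix by default, so convert it to dense.
feature_arr = ohe.fit_transform(df[cols]).toarray()

# categories_ has one array per input column, in the same order as cols.
# Prefix each category with its column name to keep the labels unique.
feature_labels = [
    f"{col}_{cat}" for col, cats in zip(cols, ohe.categories_) for cat in cats
]

# Reuse the original index so the encoded rows line up with df.
features = pd.DataFrame(feature_arr, columns=feature_labels, index=df.index)

print(features.columns.tolist())
```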
    