I want to encode 3 categorical features out of 10 features in my datasets. I use preprocessing from sklearn.preprocessing to do so as the following:
<
There is a simple fix if, like me, you get frustrated by this. Simply use Category Encoders' OneHotEncoder. This is a Sklearn Contrib package, so plays super nicely with the scikit-learn API.
This works as a direct replacement and does the boring label encoding for you.
from category_encoders import OneHotEncoder
cat_features = ['color', 'director_name', 'actor_2_name']
enc = OneHotEncoder(categorical_features=cat_features)
enc.fit(dataset.values)