I'm trying to perform a one-hot encoding of a trivial dataset.
data = [['a', 'dog', 'red'],
        ['b', 'cat', 'green']]
Wha
I've faced this problem many times, and I found a solution on page 100 of this book:
We can apply both transformations (from text categories to integer categories, then from integer categories to one-hot vectors) in one shot using the LabelBinarizer class:
The sample code is:
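from sklearn.preprocessing import LabelBinarizer

# LabelBinarizer expects a 1-D sequence of labels, so encode one column at a time
colors = [row[2] for row in data]  # ['red', 'green']
encoder = LabelBinarizer()
# with only two distinct labels, the result is a single 0/1 column
housing_cat_1hot = encoder.fit_transform(colors)
housing_cat_1hot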
As for the result, note that this returns a dense NumPy array by default. You can get a sparse matrix instead by passing sparse_output=True to the LabelBinarizer constructor.
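To make the dense versus sparse point concrete, here is a minimal sketch. The `pets` list is made up for illustration (it is not from the question or the book); I picked three distinct labels so that each class gets its own column.

```python
from scipy import sparse
from sklearn.preprocessing import LabelBinarizer

# Hypothetical 1-D label column, invented for this example
pets = ["dog", "cat", "bird", "cat"]

# Default: dense NumPy array, one column per class (classes are sorted)
dense = LabelBinarizer().fit_transform(pets)

# sparse_output=True: same values, stored as a SciPy sparse matrix
sparse_out = LabelBinarizer(sparse_output=True).fit_transform(pets)

print(dense)
print(sparse.issparse(sparse_out))
```

The sparse form mostly pays off when there are many categories, since almost every entry of a one-hot matrix is zero.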
You can find more about LabelBinarizer in the official sklearn documentation.
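One caveat, as far as I can tell: LabelBinarizer is designed for a single column of labels, so it does not encode all three columns of the question's dataset in one call. If that is what you need, a different class, OneHotEncoder, may be a better fit. The sketch below is my own suggestion rather than something from the book, and it assumes a scikit-learn version (0.20 or later, I believe) in which OneHotEncoder accepts string columns directly.

```python
from sklearn.preprocessing import OneHotEncoder

# The dataset from the question: each column is a separate categorical feature
data = [["a", "dog", "red"],
        ["b", "cat", "green"]]

encoder = OneHotEncoder()
encoded = encoder.fit_transform(data)  # returns a sparse matrix by default

# One array of (sorted) categories per input column
print(encoder.categories_)

# Convert to a dense array to inspect the one-hot columns
print(encoded.toarray())
```

Each input column gets one output column per distinct value, so this two-row example ends up with six columns.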