Does tensorflow have something similar to scikit learn\'s one hot encoder for processing categorical data? Would using a placeholder of tf.string behave as categorical data
tf.one_hot() is available in TF and easy to use.
Lets assume you have 4 possible categories (cat, dog, bird, human) and 2 instances (cat, human). So your depth=4 and your indices=[0, 3]
import tensorflow as tf
res = tf.one_hot(indices=[0, 3], depth=4)
with tf.Session() as sess:
print sess.run(res)
Keep in mind that if you provide index=-1 you will get all zeros in your one-hot vector.
Old answer, when this function was not available.
After looking though the python documentation, I have not found anything similar. One thing that strengthen my belief that it does not exist is that in their own example they write one_hot manually.
def dense_to_one_hot(labels_dense, num_classes=10):
"""Convert class labels from scalars to one-hot vectors."""
num_labels = labels_dense.shape[0]
index_offset = numpy.arange(num_labels) * num_classes
labels_one_hot = numpy.zeros((num_labels, num_classes))
labels_one_hot.flat[index_offset + labels_dense.ravel()] = 1
return labels_one_hot
You can also do this in scikitlearn.