Does TensorFlow have something similar to scikit-learn's OneHotEncoder for processing categorical data? Would a placeholder of type tf.string behave as categorical data?
There are a couple of ways to do it. One is to look up rows of an identity matrix and sum them over the sequence axis:
import numpy as np
import tensorflow as tf
ans = tf.constant([[5, 6, 0, 0], [5, 6, 7, 0]])  # shape: batch_size x max_seq_len
labels = tf.reduce_sum(tf.nn.embedding_lookup(np.identity(10), ans), 1)  # sum the one-hot rows over the sequence axis
>>> [[ 2. 0. 0. 0. 0. 1. 1. 0. 0. 0.]
>>> [ 1. 0. 0. 0. 0. 1. 1. 1. 0. 0.]]
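To make the lookup-and-sum step concrete, here is a minimal NumPy sketch of the same computation. It is only an illustration, not TensorFlow code. Note that id 0 is an ordinary index, so the 0 entries in ans are counted in column 0. Treating 0 as padding and clearing that column is an assumption, and the names counts and multi_hot are made up for this example.

```python
import numpy as np

# Same ids as in the answer: batch_size x max_seq_len
ans = np.array([[5, 6, 0, 0], [5, 6, 7, 0]])

# Indexing the identity matrix by id yields one one-hot row per id,
# giving shape (batch_size, max_seq_len, 10).
one_hot_rows = np.identity(10)[ans]

# Summing over the sequence axis counts how often each id occurs.
counts = one_hot_rows.sum(axis=1)

# Assumption: id 0 is padding, so its column is cleared.
multi_hot = counts.copy()
multi_hot[:, 0] = 0
```

If 0 is a real category rather than padding, counts is the array to keep.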
The other way is to use tf.one_hot:
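# axis=1 puts the depth dimension second (batch_size x depth x max_seq_len), so the sum runs over axis 2
labels2 = tf.reduce_sum(tf.one_hot(ans, depth=10, on_value=1, off_value=0, axis=1), 2)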
>>> [[2 0 0 0 0 1 1 0 0 0]
>>> [1 0 0 0 0 1 1 1 0 0]]
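On the tf.string part of the question: tf.one_hot takes integer indices, so string categories generally need to be mapped to integer ids first. Below is a minimal sketch of that mapping in plain NumPy. The colors data and the variable names are hypothetical, chosen only for illustration.

```python
import numpy as np

# Hypothetical categorical column of strings.
colors = np.array(["red", "green", "blue", "green"])

# np.unique returns the sorted vocabulary and, with return_inverse=True,
# the integer id of each original entry within that vocabulary.
vocab, ids = np.unique(colors, return_inverse=True)

# One row of the identity matrix per id is the one-hot encoding.
one_hot = np.identity(len(vocab), dtype=np.int64)[ids]
```

The resulting integer ids could then be passed to tf.one_hot in place of ans.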