Tensorflow One Hot Encoder?

后端未结

关注

 15  1721

一生所求 2020-12-04 13:46

Does tensorflow have something similar to scikit learn\'s one hot encoder for processing categorical data? Would using a placeholder of tf.string behave as categorical data

15条回答

半阙折子戏 (楼主)

2020-12-04 14:30

As mentioned above by @dga, Tensorflow has tf.one_hot now:

labels = tf.constant([5,3,2,4,1])
highest_label = tf.reduce_max(labels)
labels_one_hot = tf.one_hot(labels, highest_label + 1)

array([[ 0.,  0.,  0.,  0.,  0.,  1.],
       [ 0.,  0.,  0.,  1.,  0.,  0.],
       [ 0.,  0.,  1.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  1.,  0.],
       [ 0.,  1.,  0.,  0.,  0.,  0.]], dtype=float32)

You need to specify depth, otherwise you'll get a pruned one-hot tensor.

If you like to do it manually:

labels = tf.constant([5,3,2,4,1])
size = tf.shape(labels)[0]
highest_label = tf.reduce_max(labels)
labels_t = tf.reshape(labels, [-1, 1])
indices = tf.reshape(tf.range(size), [-1, 1])
idx_with_labels = tf.concat([indices, labels_t], 1)
labels_one_hot = tf.sparse_to_dense(idx_with_labels, [size, highest_label + 1], 1.0)

array([[ 0.,  0.,  0.,  0.,  0.,  1.],
       [ 0.,  0.,  0.,  1.,  0.,  0.],
       [ 0.,  0.,  1.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  1.,  0.],
       [ 0.,  1.,  0.,  0.,  0.,  0.]], dtype=float32)

Note arguments order in tf.concat()

0 讨论(0)

查看其它15个回答