Mapping one-hot encoded target values to proper label names

Submitted by 馋奶兔 on 2019-12-02 01:37:45

Question


I have a list of label names, which I enumerated to create a dictionary:

my_list = [b'airplane',
 b'automobile',
 b'bird',
 b'cat',
 b'deer',
 b'dog',
 b'frog',
 b'horse',
 b'ship',
 b'truck']

label_dict = dict(enumerate(my_list))


{0: b'airplane',
 1: b'automobile',
 2: b'bird',
 3: b'cat',
 4: b'deer',
 5: b'dog',
 6: b'frog',
 7: b'horse',
 8: b'ship',
 9: b'truck'}

Now I'm trying to cleanly map/apply the dict values to my target, which is in one-hot-encoded form.

y_test[0]

array([ 0.,  0.,  0.,  1.,  0.,  0.,  0.,  0.,  0.,  0.])


y_test[0].map(label_dict) should return: 
'cat'

I was playing around with

(lambda key,value: value for y_test[0] == 1)

but couldn't come up with anything concrete.

Thank you.


Answer 1:


Since we are working with a one-hot encoded array, argmax can be used to get the index of the single 1 in each row. Thus, using the list as input -

[my_list[i] for i in y_test.argmax(1)]

Or with np.take to have array output -

np.take(my_list,y_test.argmax(1))

To work with the dict, assuming sequential keys 0,1,.., we could have (on Python 3, the values view has to be wrapped in list() first) -

np.take(list(label_dict.values()), y_test.argmax(1))

If the keys are not necessarily sequential but are sorted -

np.take(list(label_dict.values()), np.searchsorted(list(label_dict.keys()), y_test.argmax(1)))
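The dictionary-based variants can be sketched as follows for Python 3. This is a minimal illustration, assuming the `my_list`, `label_dict` and `y_test` shown in the question; the names `idx`, `by_lookup` and `by_take` are introduced here only for the example.

```python
import numpy as np

my_list = [b'airplane', b'automobile', b'bird', b'cat', b'deer',
           b'dog', b'frog', b'horse', b'ship', b'truck']
label_dict = dict(enumerate(my_list))

y_test = np.array([
    [0., 0., 0., 1., 0., 0., 0., 0., 0., 0.],
    [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
    [0., 0., 0., 0., 0., 0., 0., 0., 1., 0.],
])

# Column index of the 1 in each row.
idx = y_test.argmax(axis=1)

# Plain dictionary lookup: does not depend on key order, as long as
# the keys are the column indices of the one-hot encoding.
by_lookup = [label_dict[int(i)] for i in idx]

# Array output: materialize the values view as a list first.
by_take = np.take(list(label_dict.values()), idx)

print(by_lookup)  # [b'cat', b'automobile', b'ship']
```

The list comprehension is probably the simplest choice when a Python list is enough; `np.take` is mainly useful when an array result is wanted.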

Sample run -

In [79]: my_list
Out[79]: 
['airplane',
 'automobile',
 'bird',
 'cat',
 'deer',
 'dog',
 'frog',
 'horse',
 'ship',
 'truck']

In [80]: y_test
Out[80]: 
array([[ 0.,  0.,  0.,  1.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  1.,  0.]])

In [81]: [my_list[i] for i in y_test.argmax(1)]
Out[81]: ['cat', 'automobile', 'ship']

In [82]: np.take(my_list,y_test.argmax(1))
Out[82]: 
array(['cat', 'automobile', 'ship'], 
      dtype='|S10')
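Note that the labels in the question are bytes (b'cat'), while the desired result is the string 'cat'. A minimal sketch of one way to handle that, assuming the labels are plain ASCII; the names `names`, `first` and `all_rows` are made up for this example:

```python
import numpy as np

my_list = [b'airplane', b'automobile', b'bird', b'cat', b'deer',
           b'dog', b'frog', b'horse', b'ship', b'truck']

y_test = np.array([
    [0., 0., 0., 1., 0., 0., 0., 0., 0., 0.],
    [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
    [0., 0., 0., 0., 0., 0., 0., 0., 1., 0.],
])

# Decode the bytes labels once, up front (assumes ASCII-only labels).
names = [label.decode('ascii') for label in my_list]

# Single row, as asked in the question.
first = names[int(y_test[0].argmax())]

# All rows at once.
all_rows = [names[i] for i in y_test.argmax(axis=1)]

print(first)     # cat
print(all_rows)  # ['cat', 'automobile', 'ship']
```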



Answer 2:


We can use a dot product to reverse the one-hot encoding, provided it really is ONE-hot (exactly one 1 per row).

Let's start with factorizing your list

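import numpy as np
import pandas as pd

# f holds the integer codes, u the unique labels in order of first appearance
f, u = pd.factorize(my_list)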

Now, if you have an array that you'd like to turn back into your strings

a = np.array([0, 0, 0, 1, 0, 0, 0, 0, 0, 0])

Then use dot

a.dot(u)

'cat'

Now assume

y_test = np.array([
        [0, 0, 0, 1, 0, 0, 0, 0, 0, 0],
        [0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0, 0, 0, 1, 0]
    ])

Then

y_test.dot(u)

array(['cat', 'automobile', 'ship'], dtype=object)
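The dot product works here because, on an object array of strings, multiplying by 0 gives an empty string, multiplying by 1 gives the string itself, and the sum concatenates the pieces. A small sketch of that idea, assuming a plain object array `u` built with NumPy in place of the pd.factorize result, and a float-valued `y_float` like the one in the question. Since Python cannot multiply a string by a float, the sketch casts to int first; the one-hot check is an added precaution, not part of the original answer.

```python
import numpy as np

# Stand-in for the unique labels (assumption: same order as my_list).
u = np.array(['airplane', 'automobile', 'bird', 'cat', 'deer',
              'dog', 'frog', 'horse', 'ship', 'truck'], dtype=object)

# Float one-hot rows, as in the question's y_test.
y_float = np.array([
    [0., 0., 0., 1., 0., 0., 0., 0., 0., 0.],
    [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
    [0., 0., 0., 0., 0., 0., 0., 0., 1., 0.],
])

# The trick only makes sense if every row has exactly one 1.
is_one_hot = bool((y_float.sum(axis=1) == 1).all())

# Cast to int so that each product is int * str (string repetition).
labels = y_float.astype(int).dot(u)

print(is_one_hot)    # True
print(list(labels))  # ['cat', 'automobile', 'ship']
```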

If it isn't one-hot but instead multi-hot, you could join with commas

y_test = np.array([
        [0, 0, 0, 1, 0, 0, 0, 0, 0, 0],
        [0, 1, 0, 0, 0, 0, 0, 0, 0, 1],
        [0, 0, 1, 0, 0, 0, 0, 0, 1, 0]
    ])

[', '.join(u[y.astype(bool)]) for y in y_test]


['cat', 'automobile, truck', 'bird, ship']
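If lists of labels are more convenient than comma-joined strings, the same multi-hot decoding can be sketched without pandas. The helper name `decode_multi_hot` and the array `labels` are hypothetical and introduced only for this illustration:

```python
import numpy as np

labels = np.array(['airplane', 'automobile', 'bird', 'cat', 'deer',
                   'dog', 'frog', 'horse', 'ship', 'truck'])

def decode_multi_hot(rows, labels):
    """Return one list of label names per row (hypothetical helper)."""
    # flatnonzero gives the column indices of the nonzero entries in a row.
    return [labels[np.flatnonzero(row)].tolist() for row in rows]

y_test = np.array([
    [0, 0, 0, 1, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 0, 0, 0, 1],
    [0, 0, 1, 0, 0, 0, 0, 0, 1, 0],
])

decoded = decode_multi_hot(y_test, labels)
print(decoded)  # [['cat'], ['automobile', 'truck'], ['bird', 'ship']]
```

A row with no active class comes back as an empty list, which may be easier to handle downstream than an empty string.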


Source: https://stackoverflow.com/questions/44464280/mapping-one-hot-encoded-target-values-to-proper-label-names
