How to go back from ONE-HOT-ENCODED labels to single column using sklearn?

為{幸葍}努か posted on 2019-12-12 18:37:36

Question


I made predictions with a model and got results like this:

[[0 0 0 ... 0 0 1]
 [0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]
 ...
 [0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 1]
 [0 0 0 ... 0 0 0]]

These are essentially the one-hot encoded labels of the target column. I now want to get back to a single column of the original values. I used the following lines for the encoding. How can I go back to a single column?

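from sklearn.preprocessing import LabelEncoder, OneHotEncoder

# df is the existing DataFrame; its 'Candidate' column holds the original labels
le_candidate = LabelEncoder()
df['candidate_encoded'] = le_candidate.fit_transform(df.Candidate)
candidate_ohe = OneHotEncoder()
Y = candidate_ohe.fit_transform(df.candidate_encoded.values.reshape(-1, 1)).toarray()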

Answer 1:


Call inverse_transform on the OneHotEncoder first and then on the LabelEncoder, reversing the order in which the encoders were applied:

import pandas as pd
from sklearn.preprocessing import LabelEncoder, OneHotEncoder

s = pd.Series(['a', 'b', 'c'])
le = LabelEncoder()
ohe = OneHotEncoder(sparse_output=False)  # scikit-learn >= 1.2; older releases use sparse=False
s = le.fit_transform(s)
s = ohe.fit_transform(s.reshape(-1,1))
print(s)

What you have:

[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]

What you should do:

inv_s = ohe.inverse_transform(s)
inv_s = le.inverse_transform(inv_s.astype(int).ravel())
inv_s

Output:

array(['a', 'b', 'c'], dtype=object)
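
If only the fitted LabelEncoder is still available, a possible alternative is to take the column index of the 1 in each row with argmax and pass that to inverse_transform. This is a minimal sketch, not part of the original answer. It assumes the one-hot columns are in the same order as le.classes_, which holds when the OneHotEncoder was fitted on LabelEncoder output containing every class. The sample labels and the hand-built one-hot matrix are illustrative only.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Illustrative labels (hypothetical data, not from the question)
labels = np.array(['a', 'b', 'c', 'a'])

le = LabelEncoder()
codes = le.fit_transform(labels)  # integer codes 0..n_classes-1

# Build the one-hot matrix by indexing an identity matrix,
# so that column i corresponds to integer code i
one_hot = np.eye(len(le.classes_), dtype=int)[codes]

# The position of the 1 in each row is the integer code
recovered_codes = one_hot.argmax(axis=1)

# Map integer codes back to the original labels
recovered = le.inverse_transform(recovered_codes)
print(recovered)  # ['a' 'b' 'c' 'a']
```

Note that argmax returns 0 for a row containing only zeros, so such a row would silently map to the first class. If the predictions can contain all-zero rows, it is worth checking one_hot.sum(axis=1) before decoding.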


Source: https://stackoverflow.com/questions/56266011/how-to-go-back-from-one-hot-encoded-labels-to-single-column-using-sklearn
