Scikit-learn's LabelBinarizer vs. OneHotEncoder


陌清茗 2020-11-30 00:37

What is the difference between the two? Both seem to create new columns, one for each unique category in the feature. Then they assig

4 answers

迷失自我 (original poster)

2020-11-30 01:00

Scikit-learn recommends using OneHotEncoder for the X matrix, i.e. the features you feed into a model, and LabelBinarizer for the y labels.

They are quite similar, except that OneHotEncoder returns a sparse matrix by default, which saves a lot of memory when there are many categories. You rarely need that for y labels.
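
To make the comparison concrete, here is a minimal sketch that runs both encoders with their default settings. The toy `X` and `y` values are made up for illustration, and the printed types may differ slightly between scikit-learn versions.

```python
import numpy as np
from scipy import sparse
from sklearn.preprocessing import LabelBinarizer, OneHotEncoder

# Toy data (invented for this example): one categorical feature column
# and a 1-D list of target labels.
X = np.array([["red"], ["green"], ["blue"], ["green"]])
y = ["cat", "dog", "bird", "dog"]

# OneHotEncoder expects 2-D input (samples x features) and, by default,
# returns a SciPy sparse matrix.
X_encoded = OneHotEncoder().fit_transform(X)

# LabelBinarizer expects 1-D labels and, by default, returns a dense
# NumPy array.
y_encoded = LabelBinarizer().fit_transform(y)

print(sparse.issparse(X_encoded))  # is the feature encoding sparse?
print(type(y_encoded))             # container used for the labels
print(X_encoded.toarray())         # densified only for display
print(y_encoded)
```

Both results have one column per category, in sorted category order. The main practical differences are the expected input shape and the default output container.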

Even if you have a multi-label, multi-class problem, you can use MultiLabelBinarizer for your y labels rather than switching to OneHotEncoder for multi-hot encoding.
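
As a rough illustration of the multi-label case, the sketch below passes a list of label sets to MultiLabelBinarizer. The genre labels are hypothetical and chosen only for this example.

```python
from sklearn.preprocessing import MultiLabelBinarizer

# Hypothetical multi-label targets: each sample can carry several labels.
y = [{"comedy", "drama"}, {"action"}, {"action", "comedy"}]

mlb = MultiLabelBinarizer()
y_encoded = mlb.fit_transform(y)

print(mlb.classes_)  # column order of the encoded matrix (sorted labels)
print(y_encoded)     # one row per sample, 1 where the label is present
```

Each row can contain more than one 1, which is the multi-hot encoding mentioned above.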

https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.OneHotEncoder.html
