sklearn.LabelEncoder with never seen before values

执笔经年 2020-11-27 10:37

If a sklearn.LabelEncoder has been fitted on a training set, its transform method raises a ValueError when it encounters labels in the test set that were not seen during fitting.
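
A minimal reproduction of the problem, assuming toy city names as labels (the data and the variable names are made up for illustration):

```python
from sklearn.preprocessing import LabelEncoder

# Fit on toy "training" labels (illustrative data, not from the question).
encoder = LabelEncoder()
encoder.fit(["paris", "tokyo", "paris"])

failed = False
try:
    # "amsterdam" was never seen during fit, so transform refuses it.
    encoder.transform(["tokyo", "amsterdam"])
except ValueError as exc:
    failed = True
    print(f"transform failed: {exc}")
```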

The only solution I c

12 answers
  •  孤独总比滥情好
    2020-11-27 11:32

    I get the impression that what you've done is quite similar to what other people do when faced with this situation.

    There's been some effort to add the ability to encode unseen labels to the LabelEncoder (see especially https://github.com/scikit-learn/scikit-learn/pull/3483 and https://github.com/scikit-learn/scikit-learn/pull/3599), but changing the existing behavior is actually more difficult than it seems at first glance.

    For now it looks like handling "out-of-vocabulary" labels is left to individual users of scikit-learn.
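
    One common user-side workaround can be sketched as follows: reserve a placeholder class at fit time and map every unseen label to it before calling transform. This is only a sketch under stated assumptions, not something scikit-learn provides: the `UNKNOWN` token and both helper functions are hypothetical names, and the placeholder must not collide with a real label.

```python
from sklearn.preprocessing import LabelEncoder

UNKNOWN = "<unknown>"  # hypothetical placeholder; must not occur as a real label


def fit_with_unknown(train_labels):
    """Fit a LabelEncoder on the training labels plus the placeholder class."""
    encoder = LabelEncoder()
    encoder.fit(list(train_labels) + [UNKNOWN])
    return encoder


def transform_with_unknown(encoder, labels):
    """Replace labels the encoder has not seen with the placeholder, then encode."""
    known = set(encoder.classes_)
    safe = [label if label in known else UNKNOWN for label in labels]
    return encoder.transform(safe)


encoder = fit_with_unknown(["cat", "dog", "cat"])
codes = transform_with_unknown(encoder, ["dog", "bird"])  # "bird" is unseen
unknown_code = encoder.transform([UNKNOWN])[0]
```

    Here "bird" receives the same code as the placeholder instead of raising an error. Whether lumping all unseen labels into one class is acceptable depends on the downstream model.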
