sklearn.LabelEncoder with never seen before values

执笔经年 2020-11-27 10:37

If a sklearn.preprocessing.LabelEncoder has been fitted on a training set, its transform() raises a ValueError when it meets values in the test set that were not present during fitting.

The only solution I c

12 answers

青春惊慌失措 (OP)

2020-11-27 11:21
If someone is still looking for a fix, here is mine.

Say you have:
enc_list: list of the names of the variables that are already encoded
enc_map: dictionary mapping each variable in enc_list to its fitted encoder (the code reads its classes_)
df: DataFrame in which a variable contains values that are not present in enc_map

This works provided the category "NA" (or "Unknown") is already among the encoded classes. Unseen values are replaced by that category before encoding.
```
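# enc_map[col] is assumed to be a fitted encoder whose classes_ include 'NA'
for col in enc_list:
    known = set(enc_map[col].classes_)
    # values present in df that the encoder never saw during fitting
    unseen = [v for v in df[col].unique() if v not in known]
    if unseen:
        df[col] = df[col].replace(unseen, 'NA')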
```
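
For anyone who wants to try this end to end, here is a minimal sketch of the same idea on toy data. The `train` and `test` frames, the `color` column and the extra `color_code` column are made up for illustration. It also assumes the encoder was fitted with the fallback category "NA" added to the training values, which is one way to satisfy the precondition above.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical toy data: "purple" appears only in the test set.
train = pd.DataFrame({"color": ["red", "green", "blue"]})
test = pd.DataFrame({"color": ["green", "purple", "red"]})

enc_list = ["color"]
enc_map = {}
for col in enc_list:
    # Fit on the training values plus the fallback category "NA".
    enc_map[col] = LabelEncoder().fit(list(train[col]) + ["NA"])

df = test.copy()
for col in enc_list:
    known = set(enc_map[col].classes_)
    # Values the encoder never saw during fitting.
    unseen = [v for v in df[col].unique() if v not in known]
    if unseen:
        df[col] = df[col].replace(unseen, "NA")
    # transform() no longer raises, since every value is a known class.
    df[col + "_code"] = enc_map[col].transform(df[col])

print(df)
```

Note that every unseen value collapses into the single "NA" code, so downstream models cannot tell different new categories apart.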