Scikit-Learn: one-hot encode before or after train/test split?


不要未来只要你来 2020-12-28 19:37

I am looking at two scenarios for building a model with scikit-learn, and I cannot figure out why one of them returns a result that is so fundamentally different from the

2 Answers

遥遥无期 (OP)

2020-12-28 20:21
I can't get your code to run, but my guess is that one of two things is happening in the test dataset:
- You're not seeing all the levels of some of the categorical variables, so if you compute the dummy variables on the test data alone, you end up with different columns than in the training data.
- Otherwise, you may have the same columns, but in a different order.
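
Since the original code could not be run, here is a minimal sketch of the first case on made-up data. The `color` column and its values are assumptions for illustration only, not taken from the question. It shows how encoding each split on its own can produce mismatched columns, and two common ways to keep the test columns consistent with the training columns.

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data: the test split happens to contain no "green" rows.
train = pd.DataFrame({"color": ["red", "green", "blue", "red"]})
test = pd.DataFrame({"color": ["blue", "red"]})

# Encoding each split independently gives different column sets.
train_dummies = pd.get_dummies(train, dtype=int)
test_dummies = pd.get_dummies(test, dtype=int)
print(list(train_dummies.columns))
print(list(test_dummies.columns))

# Option 1: force the test frame onto the training columns
# (same names, same order); missing levels are filled with 0.
test_aligned = test_dummies.reindex(columns=train_dummies.columns, fill_value=0)

# Option 2: fit the encoder on the training split only, then reuse it.
# handle_unknown="ignore" encodes levels unseen during fit as all zeros.
encoder = OneHotEncoder(handle_unknown="ignore")
encoder.fit(train[["color"]])
test_encoded = encoder.transform(test[["color"]]).toarray()
print(test_encoded)
```

Either way, the model sees the same features in the same order at training and prediction time, which rules out both of the guesses above.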