Predicting a new value with a model trained on one-hot encoded data

Submitted by 人盡茶涼 on 2021-02-17 04:41:48

Question


This might look like a trivial problem, but I am stuck on predicting results from a model. My problem is this:

I have a dataset of shape 1000 x 19 (excluding the target feature), but after one-hot encoding it becomes 1000 x 141. Since I trained the model on data of shape 1000 x 141, I need data of shape 1 x 141 (at least) for prediction. I also know that in Python I can make future predictions using

model.predict(data)

However, I am getting the data from an end user through a web portal, and it has shape 1 x 19. I am confused about how to proceed to make predictions based on the user's data.

How can I convert data of shape 1 x 19 into 1 x 141 while keeping the same column order as in the train/test data? Any help in this direction would be highly appreciated.


Answer 1:


I am assuming that you are using sklearn's OneHotEncoder to create the one-hot encoding. If so, the problem is easy to solve, because you fit the one-hot encoder on your training data:

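from sklearn.preprocessing import OneHotEncoder

# handle_unknown="ignore" encodes categories not seen during fit as all zeros
encoder = OneHotEncoder(categories="auto", handle_unknown="ignore")
# Learn the categories from the training data and encode it (1000 x 19 -> 1000 x 141)
X_train_encoded = encoder.fit_transform(X_train)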

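In the code above, the encoder is fitted on your training data, so when you get the test data (here, the 1 x 19 row from the user), you can transform it into the same encoded layout using this fitted encoder. The new data must contain the same 19 columns, in the same order, as X_train.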

test_data = encoder.transform(test_data)
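To make the fit/transform step concrete, here is a minimal sketch on a toy dataset. The column names (color, size) and their values are hypothetical and stand in for the real 19 features; the point is only that the fitted encoder gives a single new row the same width and column order as the training matrix.

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# Hypothetical training data: two categorical columns (stand-ins for the real 19).
X_train = pd.DataFrame({
    "color": ["red", "green", "blue", "red"],
    "size": ["S", "M", "L", "M"],
})

encoder = OneHotEncoder(categories="auto", handle_unknown="ignore")
X_train_encoded = encoder.fit_transform(X_train)

# One raw row from the user, with the same columns in the same order.
# "XL" was never seen during fit, so its block is encoded as all zeros.
user_row = pd.DataFrame([{"color": "green", "size": "XL"}], columns=X_train.columns)
user_encoded = encoder.transform(user_row)

print(X_train_encoded.shape)    # (4, 6)
print(user_encoded.shape)       # (1, 6)
print(user_encoded.toarray())   # [[0. 1. 0. 0. 0. 0.]]
```

The encoded columns follow the sorted categories of each input column (blue, green, red, then L, M, S), which is why the order stays identical between training and prediction.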

Your test data will now also have shape 1 x 141. The result is a sparse matrix, which exposes its shape directly, so you can check it using

test_data.shape
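If the dataset mixes numeric and categorical columns, one possible variation (not part of the answer above) is to bundle the encoder and the model in an sklearn Pipeline, so that model.predict accepts the raw row from the web portal directly. The sketch below assumes a toy dataset with hypothetical columns age and city and uses LogisticRegression only as a placeholder model.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Hypothetical training data: one numeric and one categorical feature.
X_train = pd.DataFrame({
    "age": [25, 40, 31, 58, 22, 47],
    "city": ["Paris", "Rome", "Paris", "Oslo", "Rome", "Oslo"],
})
y_train = [0, 1, 0, 1, 0, 1]

# One-hot encode only the categorical column; pass numeric columns through.
preprocess = ColumnTransformer(
    [("onehot", OneHotEncoder(handle_unknown="ignore"), ["city"])],
    remainder="passthrough",
)

# The pipeline applies the fitted encoder before every fit and predict call.
model = Pipeline([
    ("preprocess", preprocess),
    ("classifier", LogisticRegression(max_iter=1000)),
])
model.fit(X_train, y_train)

# Raw, un-encoded row from the user with the original columns.
user_row = pd.DataFrame([{"age": 35, "city": "Rome"}], columns=X_train.columns)
prediction = model.predict(user_row)
print(prediction.shape)  # (1,)
```

With this layout there is only one fitted object to keep around for the web portal, and the encoding step cannot be skipped or applied with a different column order by accident.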


Source: https://stackoverflow.com/questions/56133664/predicitng-new-value-through-a-model-trained-on-one-hot-encoded-data
