How to use a saved model for prediction in Python

◇◆丶佛笑我妖孽 submitted on 2019-12-01 00:48:05

The vectorizer is part of your model. When you save your trained SVM model, you need to also save the corresponding vectorizer.
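If the vectorizer and the classifier were fitted as two separate objects, one option is to store both in a single file. This is only a minimal sketch: the toy data, the dictionary keys, and the file name svm_with_vectorizer.pkl are assumptions made for illustration, not part of the original question.

```python
import os
import tempfile

import joblib
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import SVC

# Toy data for illustration only.
texts = [
    "cheap pills buy now",
    "limited offer buy cheap",
    "meeting agenda for monday",
    "project meeting notes attached",
]
labels = ["spam", "spam", "ham", "ham"]

vectorizer = TfidfVectorizer()
clf = SVC()
clf.fit(vectorizer.fit_transform(texts), labels)

# Persist both fitted objects together so they cannot get out of sync.
path = os.path.join(tempfile.mkdtemp(), "svm_with_vectorizer.pkl")
joblib.dump({"vectorizer": vectorizer, "clf": clf}, path)

# Later: restore both and reuse the fitted vectorizer
# (call transform, not fit_transform, on new text).
saved = joblib.load(path)
X_new = saved["vectorizer"].transform(["buy cheap pills"])
restored_prediction = saved["clf"].predict(X_new)
```

The important detail is that new text goes through the restored vectorizer's transform, so it is mapped onto the same vocabulary the classifier was trained on.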

To make this more convenient, you can use Pipeline to construct a single "fittable" object that represents the steps needed to transform raw input to prediction output. In this case, the pipeline consists of a Tf-Idf extractor and an SVM classifier:

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn import svm
from sklearn.pipeline import Pipeline

vectorizer = TfidfVectorizer()
clf = svm.SVC()
tfidf_svm = Pipeline([('tfidf', vectorizer), ('svc', clf)])

documents, y = load_training_data()
tfidf_svm.fit(documents, y)

This way, only a single object needs to be persisted:

# joblib is a standalone package (sklearn.externals.joblib was removed in scikit-learn 0.23)
import joblib
joblib.dump(tfidf_svm, 'model.pkl')

To apply the model on your testing document, load the trained pipeline and simply use its predict function as usual with raw document(s) as input.
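The load-and-predict step might look roughly like the sketch below. The toy documents and labels stand in for load_training_data(), and the temporary directory is an assumption so that the example runs on its own. In practice you would only need the joblib.load and predict lines.

```python
import os
import tempfile

import joblib
from sklearn import svm
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline

# Toy training data (hypothetical stand-in for load_training_data()).
documents = [
    "cheap pills buy now",
    "limited offer buy cheap",
    "meeting agenda for monday",
    "project meeting notes attached",
]
y = ["spam", "spam", "ham", "ham"]

# Train and save the pipeline, as in the answer above.
tfidf_svm = Pipeline([("tfidf", TfidfVectorizer()), ("svc", svm.SVC())])
tfidf_svm.fit(documents, y)
model_path = os.path.join(tempfile.mkdtemp(), "model.pkl")
joblib.dump(tfidf_svm, model_path)

# In the prediction script: load the pipeline and pass raw strings.
loaded = joblib.load(model_path)
new_documents = ["buy cheap pills", "notes from the meeting"]
predictions = loaded.predict(new_documents)
print(predictions)
```

Note that predict takes a list of raw strings. The loaded pipeline applies the fitted TF-IDF transformation internally before calling the SVM.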

I landed here after searching for "How to use saved model for prediction?", so just to add the final step to YS-L's answer:

Saving the model

import joblib
joblib.dump(fittedModel, 'name.model')

Load the saved model and predict

fittedModel = joblib.load('name.model')
fittedModel.predict(X_new)  # X_new is unseen example to be predicted

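You can also call clf.predict row by row on a pandas DataFrame column with .apply and a lambda. Here count_vect must be the same fitted vectorizer that was used when training clf:

datad['Predictions'] = datad['InputX'].apply(lambda x: str(clf.predict(count_vect.transform([x]))[0]))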
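Because transform and predict both accept a whole sequence of documents, the per-row lambda can usually be replaced by a single batched call, which tends to be much faster on large frames. A sketch under assumptions: count_vect is taken to be a fitted CountVectorizer (guessed from its name), and the training data and DataFrame contents are made up for illustration.

```python
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import SVC

# Toy training data for illustration only.
train_texts = [
    "cheap pills buy now",
    "limited offer buy cheap",
    "meeting agenda for monday",
    "project meeting notes attached",
]
train_labels = ["spam", "spam", "ham", "ham"]

count_vect = CountVectorizer()
clf = SVC()
clf.fit(count_vect.fit_transform(train_texts), train_labels)

datad = pd.DataFrame({"InputX": ["buy cheap pills", "notes from the meeting"]})

# Vectorize the whole column once and predict all rows in one call.
datad["Predictions"] = clf.predict(count_vect.transform(datad["InputX"]))

# Equivalent row-by-row version, kept here only for comparison.
row_by_row = datad["InputX"].apply(
    lambda x: clf.predict(count_vect.transform([x]))[0]
)
```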