Use predicted probability of one model to train another model and save as one single model

Submitted by ∥☆過路亽.° on 2019-12-25 02:29:01

Question


I have an XGBoost model that I use for binary classification. It uses the features f1, f2, f3, f4, f5, f6, and f7.

I want to add a second model, a LogisticRegression from sklearn, that takes the XGBoost model's output together with one of the XGBoost model's input features. That is, it should predict from f1 and out, where out is the prediction made by the XGBoost model.

I want to save these two models in a single file so that I can use them for prediction in production.

How can I do that?


Answer 1:


You would need a combination of FeatureUnion and Pipeline to achieve this.

Something like this:

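from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import FeatureUnion, Pipeline

# FeatureSelector and XGBoostClassifierTransformer are custom wrappers (described below)
final_classifier = Pipeline([
    ('features', FeatureUnion([
        ('f1', FeatureSelector()),                # passes f1 through unchanged
        ('out', XGBoostClassifierTransformer()),  # emits the XGBoost prediction
    ])),
    ('clf', LogisticRegression()),
])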

Here, FeatureSelector() and XGBoostClassifierTransformer() are custom wrappers that you write yourself. Each one needs to implement fit() and transform(), with transform() returning the output you want to send to the next part of the pipeline.
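A minimal sketch of what the two wrappers could look like follows. The constructor parameters (columns, estimator) are assumptions made for illustration, not part of the original answer. So that the sketch runs without xgboost installed, sklearn's GradientBoostingClassifier stands in for the XGBoost model; any classifier that exposes predict_proba(), such as xgboost.XGBClassifier, should slot in the same way.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin, clone
from sklearn.ensemble import GradientBoostingClassifier  # stand-in for xgboost.XGBClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import FeatureUnion, Pipeline


class FeatureSelector(BaseEstimator, TransformerMixin):
    """Pass through only the chosen column indices (hypothetical helper)."""

    def __init__(self, columns=(0,)):
        self.columns = columns

    def fit(self, X, y=None):
        return self  # nothing to learn

    def transform(self, X):
        return np.asarray(X)[:, list(self.columns)]


class XGBoostClassifierTransformer(BaseEstimator, TransformerMixin):
    """Wrap a classifier so that transform() emits P(class 1) as one column."""

    def __init__(self, estimator):
        self.estimator = estimator  # assumed parameter: any predict_proba classifier

    def fit(self, X, y):
        # Fit a fresh copy so the estimator passed in is left untouched.
        self.estimator_ = clone(self.estimator).fit(X, y)
        return self

    def transform(self, X):
        # Keep a 2-D shape (n_samples, 1) so FeatureUnion can stack it.
        return self.estimator_.predict_proba(X)[:, [1]]


# Synthetic data: seven columns standing in for f1..f7.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 7))
y = (X[:, 0] + X[:, 3] > 0).astype(int)

final_classifier = Pipeline([
    ("features", FeatureUnion([
        ("f1", FeatureSelector(columns=(0,))),
        ("out", XGBoostClassifierTransformer(GradientBoostingClassifier(random_state=0))),
    ])),
    ("clf", LogisticRegression()),
])
final_classifier.fit(X, y)

# What LogisticRegression sees: column 0 is f1, column 1 is the first model's probability.
stacked = final_classifier.named_steps["features"].transform(X)
proba = final_classifier.predict_proba(X)
```

Calling final_classifier.fit(X, y) trains the inner classifier first and then the logistic regression on the two stacked columns, so the whole chain behaves as one estimator.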

FeatureUnion calls transform() on each of its parts and concatenates the outputs column-wise. The pipeline then passes this combined output to the next step, i.e. LogisticRegression.
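The column-wise concatenation can be seen in isolation with two built-in transformers. This is only an illustration of FeatureUnion's behavior; StandardScaler and PCA are stand-ins and are not part of the model described above.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import FeatureUnion
from sklearn.preprocessing import StandardScaler

X = np.arange(12, dtype=float).reshape(4, 3)  # 4 samples, 3 features

union = FeatureUnion([
    ("scaled", StandardScaler()),   # contributes 3 columns
    ("pca", PCA(n_components=1)),   # contributes 1 column
])
combined = union.fit_transform(X)

print(combined.shape)  # (4, 4): 3 scaled columns followed by 1 PCA column
```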

The data flow looks like this:

X --> final_classifier, Pipeline
            |
            |  <== X is passed to FeatureUnion
            \/
      features, FeatureUnion
                      |
                      |  <== X is duplicated and passed to both parts
        ______________|__________________
       |                                 |
       |                                 |                         
       \/                               \/
   f1, FeatureSelector                out, XGBoostClassifierTransformer
           |                                          |   
           |<= Only f1 is selected from X             | <= All features are used in XGBoost
           |                                          |
           \/________________________________________\/
                                      |
                                      |
                                     \/
                                   clf, LogisticRegression
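Because the result is a single Pipeline object, it can be written to one file, which covers the single-file requirement in the question. The sketch below uses joblib and, to stay self-contained, a stand-in pipeline made only of sklearn classes; the file name is arbitrary. A pipeline that contains custom wrapper classes should persist the same way, on the assumption that those classes are importable from the same module path when the file is loaded in production.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data and a stand-in pipeline (not the stacked model itself).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 7))
y = (X[:, 0] > 0).astype(int)
model = Pipeline([("scale", StandardScaler()), ("clf", LogisticRegression())]).fit(X, y)

# Write the whole fitted pipeline to a single file, then load it back.
path = os.path.join(tempfile.mkdtemp(), "final_classifier.joblib")
joblib.dump(model, path)
restored = joblib.load(path)

# The restored pipeline should reproduce the original predictions.
same = np.allclose(model.predict_proba(X), restored.predict_proba(X))
```

Files written this way are pickles, so they should only be loaded from trusted sources and ideally with the same library versions that produced them.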


Source: https://stackoverflow.com/questions/52837559/use-predicted-probability-of-one-model-to-train-another-model-and-save-as-one-si
