Question
I have an XGBoost model that I am using for binary classification. It uses the features f1, f2, f3, f4, f5, f6 and f7.
I want to add a LogisticRegression model from sklearn that takes the output of the XGBoost model together with one of its input features, i.e. it must take f1 and out to make its prediction, where out is the prediction made by the XGBoost model.
I want to save these two models in a single file so that I can use them for prediction in production.
How can I do that?
Answer 1:
You would need a combination of FeatureUnion and Pipeline to achieve this.
Something like this:
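from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import FeatureUnion, Pipeline

final_classifier = Pipeline([
    ('features', FeatureUnion([
        ('f1', FeatureSelector()),                 # custom wrapper: selects f1
        ('out', XGBoostClassifierTransformer()),   # custom wrapper: XGBoost prediction
    ])),
    ('clf', LogisticRegression()),
])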
Here, FeatureSelector() and XGBoostClassifierTransformer() are custom wrappers that you can easily write yourself. Each one needs to implement fit() and transform(), with transform() returning the output you want to send to the next part of the pipeline.
FeatureUnion calls transform() on each of its internal parts and then combines the outputs column-wise. The Pipeline takes this combined output and passes it to the next step, i.e. LogisticRegression.
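The two wrappers are not part of sklearn or XGBoost, so here is a minimal sketch of how they might be written. It assumes X is a 2-D NumPy array with f1 in column 0, and that out should be the predicted probability of the positive class. The columns and estimator parameters and the build_final_classifier helper are illustrative names, not part of the original answer.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin, clone
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import FeatureUnion, Pipeline


class FeatureSelector(BaseEstimator, TransformerMixin):
    """Pass through only the chosen columns of a 2-D array."""

    def __init__(self, columns=(0,)):
        self.columns = columns  # assumption: f1 is stored in column 0

    def fit(self, X, y=None):
        return self  # nothing to learn

    def transform(self, X):
        # Index with a list so the result stays 2-D.
        return np.asarray(X)[:, list(self.columns)]


class XGBoostClassifierTransformer(BaseEstimator, TransformerMixin):
    """Fit a classifier and expose its positive-class probability as a feature."""

    def __init__(self, estimator=None):
        self.estimator = estimator  # hypothetical hook for a pre-configured model

    def fit(self, X, y):
        if self.estimator is None:
            # Imported lazily so xgboost is only required when it is used.
            from xgboost import XGBClassifier
            self.estimator_ = XGBClassifier()
        else:
            self.estimator_ = clone(self.estimator)
        self.estimator_.fit(X, y)
        return self

    def transform(self, X):
        # Column 1 of predict_proba is the positive class in binary problems.
        return self.estimator_.predict_proba(X)[:, 1].reshape(-1, 1)


def build_final_classifier(estimator=None):
    """Assemble the pipeline described above (helper name is illustrative)."""
    return Pipeline([
        ('features', FeatureUnion([
            ('f1', FeatureSelector(columns=(0,))),
            ('out', XGBoostClassifierTransformer(estimator)),
        ])),
        ('clf', LogisticRegression()),
    ])
```

Calling build_final_classifier().fit(X, y) would fit the XGBoost model first and then fit LogisticRegression on the two columns [f1, out]. Because the logistic regression is trained on in-sample XGBoost predictions, it may be worth checking the result on held-out data.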
The data flows through the pipeline like this:
            X --> final_classifier, Pipeline
                        |
                        |  <== X is passed to FeatureUnion
                        \/
             features, FeatureUnion
                        |
                        |  <== X is duplicated and passed to both parts
     ___________________|___________________
    |                                       |
    |                                       |
    \/                                      \/
f1, FeatureSelector           out, XGBoostClassifierTransformer
    |                                       |
    |<= Only f1 is selected from X          | <= All features are used in XGBoost
    |                                       |
    \/______________________________________\/
                        |
                        |
                        \/
             clf, LogisticRegression
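Since the whole chain is one Pipeline object, it can be written to a single file once it has been fitted. The sketch below uses joblib for this, which is one common option rather than the only one. To stay self-contained, it uses a small stand-in pipeline (StandardScaler plus LogisticRegression) in place of final_classifier; the file name and the random data are made up for the example.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in for the fitted final_classifier (assumption: any fitted Pipeline
# is saved and loaded the same way).
rng = np.random.RandomState(0)
X = rng.rand(40, 7)
y = (X[:, 0] > 0.5).astype(int)
model = Pipeline([
    ('scale', StandardScaler()),
    ('clf', LogisticRegression()),
]).fit(X, y)

# Write the whole pipeline to one file, then read it back.
path = os.path.join(tempfile.mkdtemp(), 'final_classifier.joblib')
joblib.dump(model, path)
restored = joblib.load(path)

# The restored pipeline predicts directly from the raw feature matrix.
predictions = restored.predict(X)
```

joblib relies on pickle, so the code that loads the file must be able to import any custom classes, such as FeatureSelector and XGBoostClassifierTransformer, from the same module path they had when the model was saved. Using the same library versions in production as in training is also advisable.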
Source: https://stackoverflow.com/questions/52837559/use-predicted-probability-of-one-model-to-train-another-model-and-save-as-one-si