Combining feature extraction classes in scikit-learn

Submitted by a 夏天 on 2019-11-29 07:40:19

Question


I'm using sklearn.pipeline.Pipeline to chain feature extractors and a classifier. Is there a way to combine multiple feature extraction classes (for example the ones from sklearn.feature_extraction.text) in parallel and join their output?

My code right now looks as follows:

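from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import Pipeline

pipeline = Pipeline([
    ('vect', CountVectorizer()),
    ('tfidf', TfidfTransformer()),
    ('clf', SGDClassifier())])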

It results in the following sequential chain:

vect -> tfidf -> clf

I want to be able to specify a pipeline that looks as follows:

vect1 -> tfidf1 \
                 -> clf
vect2 -> tfidf2 /

Answer 1:


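This has been implemented recently in the master branch of scikit-learn under the name FeatureUnion (in sklearn.pipeline). It fits several transformers in parallel on the same input and concatenates their outputs side by side: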

http://scikit-learn.org/dev/modules/pipeline.html#feature-union
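
As a rough sketch of how the two-branch layout from the question might be expressed with FeatureUnion: the branch names ('words', 'chars'), the analyzer settings, and the toy documents and labels below are illustrative assumptions, not part of the original question or answer.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import FeatureUnion, Pipeline

# Branch 1: word counts followed by tf-idf weighting (settings are assumed).
word_branch = Pipeline([
    ('vect1', CountVectorizer(analyzer='word')),
    ('tfidf1', TfidfTransformer()),
])

# Branch 2: character n-gram counts followed by tf-idf weighting (settings are assumed).
char_branch = Pipeline([
    ('vect2', CountVectorizer(analyzer='char', ngram_range=(2, 3))),
    ('tfidf2', TfidfTransformer()),
])

# FeatureUnion runs both branches on the same input and stacks the
# resulting feature matrices column-wise before the classifier sees them.
pipeline = Pipeline([
    ('features', FeatureUnion([
        ('words', word_branch),
        ('chars', char_branch),
    ])),
    ('clf', SGDClassifier(random_state=0)),
])

# Toy data, made up purely for illustration.
docs = [
    'the cat sat on the mat',
    'dogs chase cats in the yard',
    'stock prices fell sharply today',
    'the market rallied after the report',
]
labels = [0, 0, 1, 1]

pipeline.fit(docs, labels)
predictions = pipeline.predict(docs)

# Inspect the combined feature matrix produced by the union step.
union = pipeline.named_steps['features']
features = union.transform(docs)
```

Each branch is itself a small Pipeline, so the union step plays the role of the two arrows merging into clf in the diagram from the question.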



Source: https://stackoverflow.com/questions/12721486/combining-feature-extraction-classes-in-scikit-learn
