ensemble-learning

Unable to do Stacking for a Multi-label classifier

 ̄綄美尐妖づ submitted on 2021-01-28 19:12:39
Question: I am working on a multi-label text classification problem (90 target labels in total). The data distribution has a long tail and class imbalance, with around 100k records. I am using the OAA (one-against-all) strategy and trying to create an ensemble using stacking. Text features: HashingVectorizer (n_features=2**20, char analyzer), followed by TSVD to reduce the dimensionality (n_components=200).

text_pipeline = Pipeline([
    ('hashing_vectorizer', HashingVectorizer(n_features=2**20, analyzer='char'))
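sklearn's StackingClassifier does not natively handle multi-label targets, so one common workaround is to wrap the whole stacked model in OneVsRestClassifier, which fits one stack per label. A minimal sketch on synthetic data, with illustrative base estimators (the HashingVectorizer/TSVD text features from the question are omitted for brevity):

```python
# Sketch: one-vs-rest around a stacking ensemble, so stacking works per label.
from sklearn.datasets import make_multilabel_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic multi-label data standing in for the 100k-record text corpus.
X, Y = make_multilabel_classification(n_samples=200, n_classes=5, random_state=0)

stack = StackingClassifier(
    estimators=[('lr', LogisticRegression(max_iter=1000)),
                ('dt', DecisionTreeClassifier(random_state=0))],
    final_estimator=LogisticRegression(max_iter=1000),
)
ova_stack = OneVsRestClassifier(stack)  # one stacked model per label
ova_stack.fit(X, Y)
pred = ova_stack.predict(X)
print(pred.shape)  # (200, 5)
```

With 90 labels this means 90 stacked ensembles, so training cost scales accordingly.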

Feature importance in logistic regression with bagging classifier

做~自己de王妃 submitted on 2021-01-28 00:10:27
Question: I am working on a binary classification problem for which I am using logistic regression within a bagging classifier. A few lines of the code are as follows:

model = BaggingClassifier(LogisticRegression(), n_estimators=10, bootstrap=True, random_state=1)
model.fit(X, y, sample_weights)

I am interested in a feature importance metric for this model. How can this be done if the estimator for the bagging classifier is logistic regression? I am able to get the feature importance when a decision tree is used
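BaggingClassifier only exposes feature_importances_ when the base estimator has them (e.g. trees). For logistic regression, one reasonable convention is to average the absolute coefficients of the fitted sub-models in estimators_. A sketch on synthetic data:

```python
# Sketch: derive per-feature importances from a bagged logistic regression
# by averaging |coef_| over the fitted sub-models. The averaging rule is
# one convention among several, not an official sklearn API.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=300, n_features=8, random_state=1)

model = BaggingClassifier(LogisticRegression(max_iter=1000),
                          n_estimators=10, bootstrap=True, random_state=1)
model.fit(X, y)

# Each fitted sub-model exposes coef_ with shape (1, n_features).
importances = np.mean([np.abs(est.coef_[0]) for est in model.estimators_], axis=0)
print(importances.shape)  # (8,)
```

Note the coefficients are only comparable across features if the features are on the same scale, so standardise X first on real data.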

VotingClassifier with pipelines as estimators

偶尔善良 submitted on 2020-08-20 04:02:24
Question: I want to build an sklearn VotingClassifier ensemble out of multiple different models (decision tree, SVC, and a Keras network). All of them need a different kind of data preprocessing, which is why I made a pipeline for each of them.

# Define pipelines
# DTC pipeline
featuriser = Featuriser()
dtc = DecisionTreeClassifier()
dtc_pipe = Pipeline([('featuriser', featuriser), ('dtc', dtc)])
# SVC pipeline
scaler = TimeSeriesScalerMeanVariance(kind='constant')
flattener = Flattener()
svc = SVC(C =
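Pipelines are themselves estimators, so they can be passed directly into VotingClassifier as the (name, estimator) pairs. A minimal sketch with standard sklearn transformers standing in for the question's custom Featuriser/Flattener and the Keras model:

```python
# Sketch: each sub-model keeps its own preprocessing by being a Pipeline,
# and the VotingClassifier votes over the whole pipelines.
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=200, random_state=0)

dtc_pipe = Pipeline([('scale', MinMaxScaler()),
                     ('dtc', DecisionTreeClassifier(random_state=0))])
svc_pipe = Pipeline([('scale', StandardScaler()),
                     ('svc', SVC(probability=True, random_state=0))])

# Soft voting averages predict_proba, hence probability=True on the SVC.
voter = VotingClassifier(estimators=[('dtc', dtc_pipe), ('svc', svc_pipe)],
                         voting='soft')
voter.fit(X, y)
print(voter.predict(X).shape)  # (200,)
```

A Keras model can join the ensemble the same way once wrapped in a scikit-learn-compatible wrapper with fit/predict_proba.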

matplotlib does not support generators as input

独自空忆成欢 submitted on 2020-06-23 03:09:35
Question: I am running the notebook at https://github.com/vsmolyakov/experiments_with_python/blob/master/chp01/ensemble_methods.ipynb to practice ensemble methods with Python, and I get an error when running this part of the code in Python 3:

plt.figure()
(_, caps, _) = plt.errorbar(num_est, bg_clf_cv_mean, yerr=bg_clf_cv_std, c='blue', fmt='-o', capsize=5)
for cap in caps:
    cap.set_markeredgewidth(1)
plt.ylabel('Accuracy'); plt.xlabel('Ensemble Size'); plt.title('Bagging Tree Ensemble');
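This error typically means one of the plotted sequences is a lazy iterator: in Python 3, map() and zip() return iterators rather than lists, and matplotlib rejects them. Materialising the values with list() or np.array() before plotting fixes it. A sketch with dummy numbers standing in for the notebook's cross-validation scores:

```python
# Sketch: convert generators/iterators to lists before handing them to
# matplotlib. The numeric values here are placeholders, not the notebook's.
import matplotlib
matplotlib.use('Agg')  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt

num_est = [1, 2, 4, 8, 16]
means = (0.70 + 0.01 * i for i in range(5))  # a generator: passing it would fail
bg_clf_cv_mean = list(means)                 # materialise into a list first
bg_clf_cv_std = [0.02] * 5

plt.figure()
(_, caps, _) = plt.errorbar(num_est, bg_clf_cv_mean, yerr=bg_clf_cv_std,
                            c='blue', fmt='-o', capsize=5)
for cap in caps:
    cap.set_markeredgewidth(1)
plt.ylabel('Accuracy'); plt.xlabel('Ensemble Size'); plt.title('Bagging Tree Ensemble')
print(len(caps))  # 2 cap artists: top and bottom of the error bars
```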

Sklearn Voting ensemble with models using different features and testing with k-fold cross validation

不问归期 submitted on 2020-06-01 07:41:31
Question: I have a data frame with 4 different groups of features. I need to create 4 different models with these four feature groups and combine them with the ensemble voting classifier. Furthermore, I need to test the classifier using k-fold cross-validation. However, I am finding it difficult to combine the different feature sets, the voting classifier, and k-fold cross-validation using the functionality available in sklearn. The following is the code that I have so far.

y = df1.index
x = preprocessing
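One way to combine the three requirements: give each sub-model its own feature group via a ColumnTransformer inside a Pipeline, vote over the pipelines, and cross-validate the whole ensemble with cross_val_score. The column names, estimators, and data below are illustrative, not the question's:

```python
# Sketch: per-model feature groups + voting + k-fold CV, shown with two
# groups for brevity (the question has four; the pattern is the same).
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=200, n_features=6, random_state=0)
df = pd.DataFrame(X, columns=['A', 'B', 'C', 'D', 'E', 'F'])

def group_model(cols, clf):
    # ColumnTransformer keeps only `cols` (remainder='drop' is the default).
    select = ColumnTransformer([('sel', StandardScaler(), cols)])
    return Pipeline([('select', select), ('clf', clf)])

voter = VotingClassifier(estimators=[
    ('m1', group_model(['A', 'B', 'C'], LogisticRegression(max_iter=1000))),
    ('m2', group_model(['D', 'E', 'F'], DecisionTreeClassifier(random_state=0))),
])
scores = cross_val_score(voter, df, y, cv=5)  # 5-fold CV of the whole ensemble
print(len(scores))  # 5
```

Because column selection lives inside each pipeline, every CV fold re-applies it to the training split only, avoiding leakage.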

sklearn Pipeline: argument of type 'ColumnTransformer' is not iterable

最后都变了- submitted on 2020-06-01 05:07:32
Question: I am attempting to use a pipeline to feed an ensemble voting classifier, as I want the ensemble learner to use models that train on different feature sets. For this purpose, I followed the tutorial available at [1]. The following is the code that I have developed so far.

y = df1.index
x = preprocessing.scale(df1)
phy_features = ['A', 'B', 'C']
phy_transformer = Pipeline(steps=[('imputer', SimpleImputer(strategy='median')), ('scaler', StandardScaler())])
phy_processer = ColumnTransformer
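The question's code is truncated before the failing call, so the exact trigger is an assumption; one common cause of this error is passing a ColumnTransformer bare where a list of (name, step) tuples is expected. A sketch of the working layout, with a small illustrative DataFrame:

```python
# Sketch of the likely fix (assumed cause): a ColumnTransformer must be
# wrapped in the Pipeline's list of (name, step) tuples, not passed bare.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

df1 = pd.DataFrame({'A': [1.0, 2.0, 3.0, 4.0],
                    'B': [2.0, 1.0, 4.0, 3.0],
                    'C': [0.5, 1.5, 2.5, 3.5]})
y = [0, 1, 0, 1]

phy_features = ['A', 'B', 'C']
phy_transformer = Pipeline(steps=[('imputer', SimpleImputer(strategy='median')),
                                  ('scaler', StandardScaler())])
phy_processer = ColumnTransformer(transformers=[('phy', phy_transformer, phy_features)])

# Wrap the ColumnTransformer in the steps list with a name:
model = Pipeline(steps=[('preprocess', phy_processer),
                        ('clf', LogisticRegression())])
model.fit(df1, y)
print(model.predict(df1).shape)  # (4,)
```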

ValueError: A target array with shape (32, 3) was passed for an output of shape (None, 2) while using as loss `binary_crossentropy`. In Keras model

被刻印的时光 ゝ submitted on 2020-01-25 06:48:06
Question: I am trying to ensemble Keras binary pre-trained models into one multi-class model with a voting system. The binary pre-trained models are each trained on different classes. To ensemble the models, I am referring to this blog. Here is the code:

for i in os.listdir(model_root):  # loading all the models
    print(i)
    filename = model_root + "/" + i
    # load model
    model = load_model(filename, custom_objects={'KerasLayer': hub.KerasLayer})
    models.append(model)
print(len(models))  # 3
# To fit
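The shape mismatch comes from fitting against a 3-class target while each binary head outputs 2 units. One alternative that avoids retraining entirely is to combine each binary model's probability for its own class outside Keras and vote with argmax. Dummy numpy arrays stand in for the model.predict() outputs so the sketch is self-contained:

```python
# Sketch: voting over three binary models without refitting. Each model i
# is assumed to output P(sample belongs to class i); the winning model's
# index becomes the multi-class label. Random numbers replace predictions.
import numpy as np

n_samples, n_models = 32, 3
rng = np.random.default_rng(0)
# stand-ins for model.predict(x)[:, 1] from each loaded binary model
per_model_probs = [rng.random(n_samples) for _ in range(n_models)]

stacked = np.stack(per_model_probs, axis=1)    # shape (32, 3), one column per model
multi_class_pred = np.argmax(stacked, axis=1)  # highest-probability model wins
print(stacked.shape, multi_class_pred.shape)   # (32, 3) (32,)
```

This treats the ensemble as post-processing; if the probabilities of the three models are not calibrated against each other, calibrating them first would make the vote fairer.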