ensemble-learning

`h2o.cbind` accepts only of H2OFrame objects - R

左心房为你撑大大i submitted on 2019-12-11 13:15:52
Question: I'm trying to ensemble a random forest with a logistic regression using H2O in R. However, an error message appears in the following code:

> localH2O = h2o.init()
Successfully connected to http://127.0.0.1:43329/
R is connected to the H2O cluster:
    H2O cluster uptime:         3 hours 11 minutes
    H2O cluster version:        3.2.0.3
    H2O cluster name:           H2O_started_from_R_toshiba_jvd559
    H2O cluster total nodes:    1
    H2O cluster total memory:   0.97 GB
    H2O cluster total cores:    4
    H2O cluster allowed cores:  2
    H2O cluster

Homogeneous vs heterogeneous ensembles

蓝咒 submitted on 2019-12-10 14:48:39
Question: I would like to check with you whether my understanding of ensemble learning (homogeneous vs heterogeneous) is correct. Is the following statement correct? A homogeneous ensemble is a set of classifiers of the same type built upon different data, such as a random forest, while a heterogeneous ensemble is a set of classifiers of different types built upon the same data. If it's not correct, could you please clarify this point?

Answer 1: A homogeneous ensemble consists of members having a single-type base learning
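To make the distinction concrete, here is a minimal scikit-learn sketch; the particular estimators and the toy dataset are illustrative choices, not from the question:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Homogeneous: many classifiers of the SAME type (decision trees),
# each trained on a different bootstrap sample / feature subset.
homogeneous = RandomForestClassifier(n_estimators=100, random_state=0)
homogeneous.fit(X, y)

# Heterogeneous: classifiers of DIFFERENT types, all trained on the
# same data, combined here by (soft) majority vote.
heterogeneous = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("svm", SVC(probability=True)),
        ("tree", DecisionTreeClassifier()),
    ],
    voting="soft",
)
heterogeneous.fit(X, y)
```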

How to handle categorical variables in sklearn GradientBoostingClassifier?

我们两清 submitted on 2019-12-09 04:21:23
Question: I am attempting to train models with GradientBoostingClassifier using categorical variables. The following is a primitive code sample, just for trying to input categorical variables into GradientBoostingClassifier:

from sklearn import datasets
from sklearn.ensemble import GradientBoostingClassifier
import pandas

iris = datasets.load_iris()
# Use only data for 2 classes.
X = iris.data[(iris.target==0) | (iris.target==1)]
Y = iris.target[(iris.target==0) | (iris.target==1)]
# Class 0 has
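GradientBoostingClassifier expects numeric input, so categorical columns are usually encoded before fitting. A minimal sketch using pandas.get_dummies; the toy column names and values are made up for illustration:

```python
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier

# Hypothetical toy data: one categorical and one numeric feature.
df = pd.DataFrame({
    "color": ["red", "green", "blue", "green", "red", "blue"],
    "size": [1.0, 2.0, 1.5, 2.2, 0.9, 1.7],
    "label": [0, 1, 0, 1, 0, 1],
})

# One-hot encode the categorical column; the trees then split on 0/1 dummies.
X = pd.get_dummies(df[["color", "size"]], columns=["color"])
y = df["label"]

clf = GradientBoostingClassifier(n_estimators=50)
clf.fit(X, y)
print(clf.predict(X))
```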

How does sklearn's AdaBoost predict_proba work internally?

◇◆丶佛笑我妖孽 submitted on 2019-12-06 05:32:31
I'm using sklearn's predict_proba() to predict the probability of a sample belonging to a category for each estimator in the AdaBoost classifier:

from sklearn.ensemble import AdaBoostClassifier
clf = AdaBoostClassifier(n_estimators=50)
for estimator in clf.estimators_:
    print(estimator.predict_proba(X_test))

AdaBoost implements its predict_proba() like this: https://github.com/scikit-learn/scikit-learn/blob/bb39b49/sklearn/ensemble/weight_boosting.py#L733 DecisionTreeClassifier is sklearn's base estimator for the AdaBoost classifier. DecisionTreeClassifier implements its predict_proba() like this:
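The key point in the linked source is that, under the SAMME.R algorithm (the default when this question was asked; it has since been deprecated and removed in recent scikit-learn releases), the ensemble does not average the trees' raw predict_proba outputs. Each estimator contributes a transformed log-probability, and a softmax turns the summed contributions back into probabilities. A rough sketch adapted from that source, for illustration rather than a verbatim copy:

```python
import numpy as np
from scipy.special import softmax
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier

X, y = make_classification(n_samples=200, random_state=0)
# Requires a scikit-learn version that still supports SAMME.R.
clf = AdaBoostClassifier(n_estimators=50, algorithm="SAMME.R")
clf.fit(X, y)

def samme_r_contribution(estimator, X, n_classes):
    # Each estimator contributes (K - 1) * (log p - mean(log p));
    # the ensemble sums these instead of averaging predict_proba outputs.
    proba = estimator.predict_proba(X)
    proba = np.clip(proba, np.finfo(proba.dtype).eps, None)  # avoid log(0)
    log_proba = np.log(proba)
    return (n_classes - 1) * (log_proba - log_proba.mean(axis=1, keepdims=True))

n_classes = len(clf.classes_)
decision = sum(samme_r_contribution(est, X, n_classes) for est in clf.estimators_)
decision /= clf.estimator_weights_.sum()

probs = softmax(decision / (n_classes - 1), axis=1)
# Should print True on versions implementing SAMME.R as in the linked source.
print(np.allclose(probs, clf.predict_proba(X)))
```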

xgb.plot.tree layout in R

纵饮孤独 submitted on 2019-12-05 19:25:01
I was reading an xgboost notebook, and the xgb.plot.tree command in the example results in a picture like this. However, when I do the same thing I get a picture like this, which is two separate graphs, in different colors too. Is that normal? Are the two graphs two trees?

Jmi47: I have the same issue. According to an issue on the xgboost GitHub repository, this could be due to a change in the DiagrammeR library used by xgboost for rendering trees: https://github.com/dmlc/xgboost/issues/2640 Instead of modifying the dgr_graph object with DiagrammeR commands, I chose to create a new version of the function

How to merge keras sequential models with same input?

ぐ巨炮叔叔 submitted on 2019-12-05 05:16:23
I am trying to create my first ensemble models in keras. I have 3 input values and a single output value in my dataset.

from keras.optimizers import SGD, Adam
from keras.layers import Dense, Merge
from keras.models import Sequential

model1 = Sequential()
model1.add(Dense(3, input_dim=3, activation='relu'))
model1.add(Dense(2, activation='relu'))
model1.add(Dense(2, activation='tanh'))
model1.compile(loss='mse', optimizer='Adam', metrics=['accuracy'])

model2 = Sequential()
model2.add(Dense(3, input_dim=3, activation='linear'))
model2.add(Dense(4, activation='tanh'))
model2.add(Dense(3, activation
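The Merge layer imported above was removed in Keras 2; the usual way to combine models over the same input is the functional API. A minimal sketch assuming the 3-input / 1-output setup described in the question (layer sizes are illustrative):

```python
# Sketch using the Keras functional API (tf.keras), since the old
# Merge layer no longer exists in Keras 2.
from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(3,))          # one shared input for both branches

# Branch 1
b1 = layers.Dense(3, activation="relu")(inputs)
b1 = layers.Dense(2, activation="tanh")(b1)

# Branch 2
b2 = layers.Dense(3, activation="linear")(inputs)
b2 = layers.Dense(4, activation="tanh")(b2)

# Merge the branches, e.g. by concatenation, then map to the single output.
merged = layers.concatenate([b1, b2])
output = layers.Dense(1)(merged)

model = keras.Model(inputs=inputs, outputs=output)
model.compile(loss="mse", optimizer="adam")
```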

Custom learner function for Adaboost

偶尔善良 submitted on 2019-12-05 01:51:27
I am using AdaBoost to fit a classification problem. We can do the following:

ens = fitensemble(X, Y, 'AdaBoostM1', 100, 'Tree')

Now 'Tree' is the learner, and we can change this to 'Discriminant' or 'KNN'. Each learner uses a certain template object creation function. More info here. Is it possible to create your own function and use it as a learner? And how? I opened templateTree.m and templateKNN.m to see how MATLAB defines a template object creation function:

function temp = templateKNN(varargin)
classreg.learning.FitTemplate.catchType(varargin{:});
temp = classreg.learning.FitTemplate.make(
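The question itself is about MATLAB's fitensemble templates, but for comparison, the analogous "custom learner" idea in scikit-learn is passing any estimator whose fit() accepts sample weights as AdaBoost's weak learner. A purely illustrative sketch:

```python
# scikit-learn analogue of swapping AdaBoost's base learner
# (illustrative only; it does not answer the MATLAB template question).
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, random_state=0)

# Any classifier supporting sample_weight in fit() can serve as the weak
# learner; the keyword is `estimator` (older versions used `base_estimator`).
weak = DecisionTreeClassifier(max_depth=2)
clf = AdaBoostClassifier(estimator=weak, n_estimators=100)
clf.fit(X, y)
```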

Ensemble of different kinds of regressors using scikit-learn (or any other python framework)

最后都变了- submitted on 2019-12-03 01:13:58
Question: I am trying to solve a regression task. I found that 3 models work nicely for different subsets of the data: LassoLARS, SVR, and Gradient Tree Boosting. I noticed that when I make predictions using all 3 models and then tabulate the 'true output' against the outputs of my 3 models, each time at least one of the models is really close to the true output, though the other 2 can be relatively far away. When I compute the minimal possible error (if I take the prediction from the 'best'
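One common way to combine models like these is stacking: train a simple meta-regressor on the base models' cross-validated predictions rather than picking a single winner. A minimal sketch using StackingRegressor (available in scikit-learn ≥ 0.22, so after this question was asked); the base estimators mirror the question, while the dataset and hyperparameters are stand-ins:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, StackingRegressor
from sklearn.linear_model import LassoLars, LinearRegression
from sklearn.svm import SVR

X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=0)

# Base models from the question; the linear meta-learner combines their
# out-of-fold predictions, learning which model to trust where.
stack = StackingRegressor(
    estimators=[
        ("lasso", LassoLars(alpha=0.01)),
        ("svr", SVR()),
        ("gbr", GradientBoostingRegressor()),
    ],
    final_estimator=LinearRegression(),
    cv=5,
)
stack.fit(X, y)
print(stack.predict(X[:5]))
```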