ensemble-learning

Ensemble different datasets in R

Posted by 女生的网名这么多〃 on 2020-01-06 04:45:10
Question: I am trying to combine signals from different models using the example described here. I have different datasets which predict the same output. However, when I combine the model outputs in caretList and ensemble the signals, it gives an error:

Error in check_bestpreds_resamples(modelLibrary) : Component models do not have the same re-sampling strategies

Here is a reproducible example:

library(caret)
library(caretEnsemble)
df1 <- data.frame(x1 = rnorm(200), x2 = rnorm(200), y = as.factor
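In caret, this error usually means the component models were trained under different resampling setups; the usual fix is to give every model the same trainControl with a fixed index. The same principle, sketched in Python with sklearn (the toy data and variable names are invented for illustration): every base model must be evaluated on identical cross-validation folds before its out-of-fold predictions can be stacked.

```python
# Sketch: reuse one CV splitter for every base model, mirroring caret's
# shared trainControl(index = ...) fix, so out-of-fold predictions align.
import numpy as np
from sklearn.model_selection import KFold, cross_val_predict
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + rng.normal(size=200) > 0).astype(int)

cv = KFold(n_splits=5, shuffle=True, random_state=42)  # one splitter, reused

# Both models see exactly the same folds, so their out-of-fold
# probabilities are comparable row for row.
oof_lr = cross_val_predict(LogisticRegression(), X, y,
                           cv=cv, method="predict_proba")[:, 1]
oof_dt = cross_val_predict(DecisionTreeClassifier(random_state=0), X, y,
                           cv=cv, method="predict_proba")[:, 1]

# Aligned signals can now be stacked as features for a meta-learner.
meta_X = np.column_stack([oof_lr, oof_dt])
```

The key design point is that `cv` is constructed once and passed to every model; building a fresh splitter per model is the Python analogue of the mismatched resampling strategies caret complains about.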

How would you interpret an ensemble tree model?

Posted by 橙三吉。 on 2019-12-25 18:21:48
Question: In machine learning, ensemble tree models such as random forests are common. These models consist of an ensemble of so-called decision tree models. How can we analyse, however, what those models have specifically learned?

Answer 1: You cannot, in the sense that you can simply plot a single decision tree. Only extremely simple models can be easily investigated. More complex methods require more complex tools, which are just approximations, general ideas of what to look for. So for ensembles you can try
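As one concrete instance of the approximate tools the answer alludes to, here is a minimal sklearn sketch (the toy data is invented): instead of reading individual trees, inspect the forest's aggregate feature importances.

```python
# Sketch: aggregate feature importances summarise what a forest has learned
# without plotting any single tree. Only feature 0 carries signal here.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
y = (X[:, 0] > 0).astype(int)          # label depends on feature 0 only

forest = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# One non-negative weight per feature; the weights sum to 1.
importances = forest.feature_importances_
```

On data like this, the importance of feature 0 should dominate; permutation importance and partial-dependence plots are the usual next steps when impurity-based importances are not trustworthy.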

Grid search on parameters inside the parameters of a BaggingClassifier

Posted by 别说谁变了你拦得住时间么 on 2019-12-24 20:32:13
Question: This is a follow-up on a question answered here, but I believe it deserves its own thread. In the previous question, we were dealing with "an Ensemble of Ensemble classifiers, where each has its own parameters." Let's start with the example provided by MaximeKan in his answer:

my_est = BaggingClassifier(RandomForestClassifier(n_estimators = 100, bootstrap = True, max_features = 0.5), n_estimators = 5, bootstrap_features = False, bootstrap = False, max_features = 1.0, max_samples = 0.6)

Now
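A sketch of how such nested parameters are usually searched in sklearn (toy data invented): GridSearchCV reaches the inner model's parameters through the double-underscore convention. Note that BaggingClassifier exposes the inner model as `estimator` in sklearn >= 1.2 and as `base_estimator` before, so the sketch looks up whichever name the installed version uses.

```python
# Sketch: tune both the Bagging wrapper's own parameters and the inner
# RandomForest's parameters in one grid, via "<name>__<param>" keys.
import numpy as np
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(0)
X = rng.normal(size=(120, 3))
y = (X[:, 0] > 0).astype(int)

bag = BaggingClassifier(RandomForestClassifier(random_state=0),
                        n_estimators=3, random_state=0)

# sklearn >= 1.2 names the inner model "estimator"; older versions used
# "base_estimator". Pick whichever this installation exposes.
inner = "estimator" if "estimator" in bag.get_params() else "base_estimator"

grid = GridSearchCV(
    bag,
    param_grid={
        "max_samples": [0.6, 1.0],              # Bagging's own parameter
        f"{inner}__n_estimators": [10, 25],     # RandomForest's parameter
    },
    cv=3,
)
grid.fit(X, y)
print(grid.best_params_)
```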

h2oensemble Error in value[[3L]](cond) : argument “training_frame” must be a valid H2O H2OFrame or id

Posted by 若如初见. on 2019-12-24 01:55:27
Question: While trying to run the example on H2OEnsemble found at http://learn.h2o.ai/content/tutorials/ensembles-stacking/index.html from within RStudio, I encounter the following error:

Error in value[[3L]](cond) : argument "training_frame" must be a valid H2O H2OFrame or id

after defining the ensemble:

fit <- h2o.ensemble(x = x, y = y, training_frame = train, family = family, learner = learner, metalearner = metalearner, cvControl = list(V = 5, shuffle = TRUE))

I installed the latest version of both h2o and

xgb.plot.tree layout in r

Posted by 无人久伴 on 2019-12-22 10:44:39
Question: I was reading an xgboost notebook, and the xgb.plot.tree command in the example produced a picture like this: However, when I do the same thing I get a picture like this, which is two separate graphs, in different colors too. Is that normal? Are the two graphs two trees?

Answer 1: I have the same issue. According to an issue on the xgboost GitHub repository, this could be due to a change in the DiagrammeR library used by xgboost for rendering trees: https://github.com/dmlc/xgboost/issues/2640 Instead of

Custom learner function for Adaboost

Posted by こ雲淡風輕ζ on 2019-12-22 03:48:50
Question: I am using AdaBoost to fit a classification problem. We can do the following:

ens = fitensemble(X, Y, 'AdaBoostM1', 100, 'Tree')

Here 'Tree' is the learner, and we can change this to 'Discriminant' or 'KNN'. Each learner uses a certain Template Object Creation Function. More info here. Is it possible to create your own function and use it as a learner? And how?

Answer 1: I opened templateTree.m and templateKNN.m to see how MATLAB defines a Template Object Creation Function.

function temp = templateKNN

Using sklearn voting ensemble with partial fit

Posted by 蓝咒 on 2019-12-18 20:01:51
Question: Can someone please explain how to use ensembles in sklearn with partial fit? I don't want to retrain my model. Alternatively, can we pass pre-trained models for ensembling? I have seen that the voting classifier, for example, does not support training using partial fit.

Answer 1: The mlxtend library has an implementation of EnsembleVoteClassifier which allows you to pass in pre-fitted models. For example, if you have three pre-trained models clf1, clf2, clf3, the following code would work:

from mlxtend.classifier
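If pulling in mlxtend is not an option, the idea behind pre-fitted ensembling is small enough to sketch by hand (the class names below are invented for illustration): a voter only needs each model's predict method, so nothing is retrained and partial-fit support in the ensemble itself is unnecessary.

```python
# Sketch: hard majority voting over models that were trained elsewhere.
from collections import Counter

class PrefitVoter:
    """Majority-vote ensemble over already-fitted models (never refits)."""

    def __init__(self, models):
        self.models = models          # assumed trained before being passed in

    def predict(self, rows):
        preds = [m.predict(rows) for m in self.models]
        # Vote per sample across the pre-fitted models.
        return [Counter(col).most_common(1)[0][0] for col in zip(*preds)]

# Hypothetical stand-ins for clf1, clf2, clf3 from the answer:
class Always:
    def __init__(self, label):
        self.label = label
    def predict(self, rows):
        return [self.label] * len(rows)

voter = PrefitVoter([Always(1), Always(1), Always(0)])
print(voter.predict([[0.1], [0.2]]))  # majority of (1, 1, 0) -> [1, 1]
```

For soft voting you would average each model's predict_proba output instead of counting labels.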

How to construct dataframe for time series data using ensemble learning methods

Posted by 白昼怎懂夜的黑 on 2019-12-13 09:16:18
Question: I am trying to predict the Bitcoin price at t+5, i.e. 5 minutes ahead, using 11 technical indicators up to time t, which can all be calculated from the open, high, low, close and volume values of the Bitcoin time series (see my full data set here). As far as I know, it is not necessary to manipulate the data frame when using algorithms like regression trees, support vector machines or artificial neural networks, but when using ensemble methods like random forests (RF) and boosting, I heard
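One common way to set up such a data frame, sketched with pandas on an invented toy series (the column names are made up): lag the inputs and shift the target 5 steps ahead, so each row pairs past features with the future price the ensemble should predict.

```python
# Sketch: turn a price series into a supervised-learning frame for tree
# ensembles by lagging features and shifting the target forward.
import pandas as pd

df = pd.DataFrame({"close": [100, 101, 103, 102, 104, 107, 106, 108, 110, 109]})

for lag in (1, 2, 3):
    df[f"close_lag{lag}"] = df["close"].shift(lag)   # past values as features

df["target"] = df["close"].shift(-5)                 # price 5 steps ahead

# Rows at the start lack full lag history; rows at the end lack a target.
supervised = df.dropna().reset_index(drop=True)
```

With real data the 11 indicators would take the place of the lag columns, but the shift(-5) target construction is the same.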

How to restart kernel in python through code in spyder to avoid repetitively and manually doing it?

Posted by 限于喜欢 on 2019-12-11 17:21:21
Question: I am ensembling 2 models in Spyder (an RNN and an encoder-decoder, both running on the same data set). Each of them has 100 models saved in .h5 format. After loading and running the RNN, if I try to load the models for the encoder-decoder, the system becomes slow. The solution I found was to restart the kernel by pressing Ctrl+. . How do I automate this through code so that both models can be run in a single script?

Source: https://stackoverflow.com/questions/55554746/how-to-restart-kernel-in-python-through-code
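One way to avoid restarting the kernel at all, sketched below: run each batch of models in its own child interpreter via subprocess, so all of its memory is released when the process exits (the run_rnn.py / run_encdec.py script names are hypothetical).

```python
# Sketch: a fresh child interpreter per model batch releases all memory on
# exit, which is what restarting the kernel achieves manually.
import subprocess
import sys

def run_in_fresh_process(code):
    """Execute `code` in a brand-new Python interpreter; return its stdout."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True, text=True, check=True,
    )
    return result.stdout

# In practice `code` would load and run one model batch, e.g.
# "exec(open('run_rnn.py').read())" -- here just placeholders:
out1 = run_in_fresh_process("print('RNN batch done')")
out2 = run_in_fresh_process("print('Encoder-decoder batch done')")
```

The orchestrating script then stays small and never holds both model sets in memory at once.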

Finding contribution by each feature into making particular prediction by h2o ensemble model

Posted by て烟熏妆下的殇ゞ on 2019-12-11 16:41:54
Question: I am trying to explain the decisions taken by an h2o GBM model, based on the idea in https://medium.com/applied-data-science/new-r-package-the-xgboost-explainer-51dd7d1aa211. I want to calculate the contribution of each feature to a particular decision at test time. Is it possible to get each individual tree from the ensemble along with the log-odds at every node? I will also need the path traversed in each tree by the model while making the prediction.

Answer 1: H2O doesn't have an equivalent
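To make the xgboost-explainer idea concrete outside of any particular library, here is a toy, hand-built sketch (the tree structure and its values are invented; this is not an h2o API): walk the decision path for one sample and credit the change in node value (log-odds) at each split to the feature tested there.

```python
# Sketch: per-feature contributions from a single tree's decision path,
# following the xgboost-explainer attribution idea.

def path_contributions(tree, x):
    """Return {feature: delta_log_odds} accumulated along x's decision path."""
    contribs = {}
    node = tree
    while "split" in node:
        feat, thresh = node["split"]
        child = node["left"] if x[feat] <= thresh else node["right"]
        # The jump in node value when moving to the child is attributed
        # to the feature that decided the direction.
        contribs[feat] = contribs.get(feat, 0.0) + child["value"] - node["value"]
        node = child
    return contribs

# A hand-built two-level tree with node values in log-odds:
tree = {
    "value": 0.0, "split": ("x1", 0.5),
    "left":  {"value": -0.4},
    "right": {"value": 0.3, "split": ("x2", 2.0),
              "left": {"value": 0.1}, "right": {"value": 0.9}},
}

print(path_contributions(tree, {"x1": 1.0, "x2": 3.0}))
```

For an ensemble, the same walk is repeated over every tree and the per-feature deltas are summed, which is exactly why the question asks for each tree plus the per-node log-odds.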