grid-search

sklearn - How to retrieve PCA components and explained variance from inside a Pipeline passed to GridSearchCV

我们两清 submitted on 2021-02-08 06:51:35
Question: I am using GridSearchCV with a pipeline as follows:

    grid = GridSearchCV(
        Pipeline([
            ('reduce_dim', PCA()),
            ('classify', RandomForestClassifier(n_jobs = -1))
        ]),
        param_grid=[
            {
                'reduce_dim__n_components': range(0.7,0.9,0.1),
                'classify__n_estimators': range(10,50,5),
                'classify__max_features': ['auto', 0.2],
                'classify__min_samples_leaf': [40,50,60],
                'classify__criterion': ['gini', 'entropy']
            }
        ],
        cv=5, scoring='f1')
    grid.fit(X,y)

How do I now retrieve PCA details like components and explained
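A minimal sketch of one way to read the PCA attributes back after the search, assuming the default refit=True so that best_estimator_ is a fitted Pipeline. The synthetic data, the reduced grid, and the names pipe, best_pca, components and explained are illustrative assumptions, not part of the question. Note that Python's range() accepts only integers, so the fractional n_components values are written as a plain list here.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Synthetic stand-in for the asker's X, y (assumption for illustration).
X, y = make_classification(n_samples=200, n_features=10,
                           n_informative=5, random_state=0)

pipe = Pipeline([
    # A float n_components in (0, 1) means "keep this share of the variance".
    ("reduce_dim", PCA(svd_solver="full")),
    ("classify", RandomForestClassifier(n_estimators=20, random_state=0)),
])

# Reduced grid (assumption) so the sketch runs quickly.
param_grid = {
    "reduce_dim__n_components": [0.7, 0.8, 0.9],
    "classify__min_samples_leaf": [1, 5],
}

grid = GridSearchCV(pipe, param_grid=param_grid, cv=3, scoring="f1")
grid.fit(X, y)

# best_estimator_ is the pipeline refitted on all of X with the best parameters;
# its steps are reachable by the names given when the Pipeline was built.
best_pca = grid.best_estimator_.named_steps["reduce_dim"]
components = best_pca.components_                # shape (n_components_, n_features)
explained = best_pca.explained_variance_ratio_   # one ratio per kept component
```

The attributes come from the single refitted pipeline only; the PCA objects fitted on the individual cross-validation folds are not kept by GridSearchCV.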

sklearn - How to retrieve PCA components and explained variance from inside a Pipeline passed to GridSearchCV

空扰寡人 submitted on 2021-02-08 06:48:50
Question: I am using GridSearchCV with a pipeline as follows:

    grid = GridSearchCV(
        Pipeline([
            ('reduce_dim', PCA()),
            ('classify', RandomForestClassifier(n_jobs = -1))
        ]),
        param_grid=[
            {
                'reduce_dim__n_components': range(0.7,0.9,0.1),
                'classify__n_estimators': range(10,50,5),
                'classify__max_features': ['auto', 0.2],
                'classify__min_samples_leaf': [40,50,60],
                'classify__criterion': ['gini', 'entropy']
            }
        ],
        cv=5, scoring='f1')
    grid.fit(X,y)

How do I now retrieve PCA details like components and explained

GridSearchCV Random Forest Regressor Tuning Best Params

雨燕双飞 submitted on 2021-02-07 13:22:51
Question: I want to improve the parameters of this GridSearchCV for a Random Forest Regressor.

    def Grid_Search_CV_RFR(X_train, y_train):
        from sklearn.model_selection import GridSearchCV
        from sklearn.model_selection import ShuffleSplit
        from sklearn.ensemble import RandomForestRegressor

        estimator = RandomForestRegressor()
        param_grid = {
            "n_estimators" : [10,20,30],
            "max_features" : ["auto", "sqrt", "log2"],
            "min_samples_split" : [2,4,8],
            "bootstrap": [True, False],
        }
        grid = GridSearchCV(estimator, param

sklearn grid.fit(X,y) - error: “positional indexers are out-of-bounds” for X_train,y_train

人走茶凉 submitted on 2021-02-07 10:11:26
Question: This is a question about scikit-learn (version 0.17.0) in Python 2.7 along with Pandas 0.17.1. After splitting raw data (with no missing entries) using the approach detailed here, I have found that calling .fit() on the split data raises an error. Here is the code, taken largely unchanged from the other Stack Overflow question, with the variables renamed. I have then instantiated a grid and tried to fit the split data with the aim of determining optimal

Scikit-learn: How do we define a distance metric's parameter for grid search

╄→гoц情女王★ submitted on 2021-02-07 09:45:48
Question: I have the following code snippet that attempts a grid search in which one of the grid parameters is the distance metric used by the KNN algorithm. The example below fails if I use the "wminkowski", "seuclidean" or "mahalanobis" distance metrics.

    # Define the parameter values that should be searched
    k_range = range(1,31)
    weights = ['uniform' , 'distance']
    algos = ['auto', 'ball_tree', 'kd_tree', 'brute']
    leaf_sizes = range(10, 60, 10)
    metrics = ["euclidean", "manhattan", "chebyshev",
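A hedged sketch of one way to include metrics that need extra arguments: "seuclidean" expects a variance vector V and "mahalanobis" an inverse covariance matrix VI, and both can be supplied through the metric_params argument of KNeighborsClassifier. Passing param_grid as a list of dicts keeps each metric_params value paired with its own metric. The synthetic data, the reduced grid, and the choice to estimate V and VI once on the full X are simplifying assumptions. "wminkowski" is left out because, as far as I know, recent releases no longer provide it.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in data (assumption for illustration).
X, y = make_classification(n_samples=150, n_features=6,
                           n_informative=4, random_state=0)

# Simplification (assumption): statistics are estimated once on all of X,
# not separately inside each cross-validation training fold.
V = X.var(axis=0, ddof=1)                    # per-feature variances for seuclidean
VI = np.linalg.inv(np.cov(X, rowvar=False))  # inverse covariance for mahalanobis

k_values = [3, 5, 7]
param_grid = [
    # Metrics that need no extra arguments.
    {"metric": ["euclidean", "manhattan", "chebyshev"], "n_neighbors": k_values},
    # Metrics that need arguments get their own sub-grid.
    {"metric": ["seuclidean"], "metric_params": [{"V": V}], "n_neighbors": k_values},
    {"metric": ["mahalanobis"], "metric_params": [{"VI": VI}], "n_neighbors": k_values},
]

knn_grid = GridSearchCV(KNeighborsClassifier(algorithm="ball_tree"),
                        param_grid, cv=3, scoring="accuracy")
knn_grid.fit(X, y)
```

A single flat dict would instead cross every metric with every metric_params value, which produces invalid combinations such as "euclidean" with V.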

Combining Recursive Feature Elimination and Grid Search in scikit-learn

醉酒当歌 submitted on 2021-02-07 07:09:14
Question: I am trying to combine recursive feature elimination and grid search in scikit-learn. As you can see from the code below (which works), I am able to get the best estimator from a grid search and then pass that estimator to RFECV. However, I would rather do the RFECV first, then the grid search. The problem is that when I pass the selector from RFECV to the grid search, it is rejected: ValueError: Invalid parameter bootstrap for estimator RFECV Is it possible to get the selector from
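The quoted ValueError is what scikit-learn raises when a parameter name is looked up on the wrong object: bootstrap belongs to the forest wrapped by RFECV, not to RFECV itself. A minimal sketch of the nested-parameter form, assuming a random forest as the wrapped estimator; the synthetic data and the names selector, rfe_grid, best_selector and selected_mask are illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFECV
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in data (assumption for illustration).
X, y = make_classification(n_samples=150, n_features=8,
                           n_informative=4, random_state=0)

# RFECV stores the wrapped model under its "estimator" parameter.
selector = RFECV(RandomForestClassifier(n_estimators=10, random_state=0),
                 step=2, cv=3)

# Parameters of the wrapped forest are addressed as estimator__<name>;
# a bare "bootstrap" key would be looked up on RFECV and rejected.
param_grid = {"estimator__bootstrap": [True, False]}

rfe_grid = GridSearchCV(selector, param_grid, cv=3)
rfe_grid.fit(X, y)

best_selector = rfe_grid.best_estimator_   # a fitted RFECV
selected_mask = best_selector.support_     # boolean mask of the kept features
```

Calling selector.get_params().keys() lists every name that is valid in param_grid, which is a quick way to check the prefix.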

Combining Recursive Feature Elimination and Grid Search in scikit-learn

醉酒当歌 submitted on 2021-02-07 07:07:37
Question: I am trying to combine recursive feature elimination and grid search in scikit-learn. As you can see from the code below (which works), I am able to get the best estimator from a grid search and then pass that estimator to RFECV. However, I would rather do the RFECV first, then the grid search. The problem is that when I pass the selector from RFECV to the grid search, it is rejected: ValueError: Invalid parameter bootstrap for estimator RFECV Is it possible to get the selector from

How to use expand.grid values to run various model hyperparameter combinations for ranger in R

我与影子孤独终老i submitted on 2021-01-29 08:32:02
Question: I've seen various posts on how to select the independent variables for a model by using expand.grid and then create a formula based on that selection. However, I prepare my input tables beforehand and store them in a list.

    library(ranger)
    data(iris)
    Input_list <- list(iris1 = iris, iris2 = iris) # let's assume these are different input tables

I'm rather interested in trying all the possible hyperparameter combinations for a given algorithm (here: Random Forest using ranger) for my list of

How to get all the models (one for each set of parameters) using GridSearchCV?

拜拜、爱过 submitted on 2021-01-27 15:11:25
Question: From my understanding: best_estimator_ provides the estimator with the highest score; best_score_ provides the score of the selected estimator; cv_results_ may be used to get the scores of all estimators. However, it is not clear to me how to get the estimators themselves.

Answer 1: As I see it, you cannot. What you can do is take the best parameter combination from best_params_ and fit the model again with those same parameters. Check out the attributes of GridSearchCV.

Source: https:/
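The refitting idea in the answer extends to every candidate, because cv_results_["params"] lists each parameter combination that was tried. A minimal sketch, assuming it is acceptable to refit each candidate on the full data; the logistic regression, the grid over C, and the names search and all_models are illustrative assumptions.

```python
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in data (assumption for illustration).
X, y = make_classification(n_samples=120, n_features=5, random_state=0)

search = GridSearchCV(LogisticRegression(max_iter=1000),
                      {"C": [0.1, 1.0, 10.0]}, cv=3)
search.fit(X, y)

# GridSearchCV does not keep the per-candidate models, so rebuild them:
# clone the unfitted base estimator, apply one parameter set, fit on all data.
all_models = [
    clone(search.estimator).set_params(**params).fit(X, y)
    for params in search.cv_results_["params"]
]
```

These models are trained on all of X, so they are not the same objects that produced the fold scores in cv_results_; they only share the hyperparameters.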

Perform feature selection using pipeline and gridsearch

吃可爱长大的小学妹 submitted on 2020-12-12 11:47:33
Question: As part of a research project, I want to select the combination of preprocessing techniques and textual features that optimizes the results of a text classification task. For this, I am using Python 3.6. There are a number of methods to combine features and algorithms, but I want to take full advantage of sklearn's pipelines and test all the different (valid) possibilities using grid search for the ultimate feature combo. My first step was to build a pipeline that looks like the following