grid-search

Skip forbidden parameter combinations when using GridSearchCV

为君一笑 submitted on 2020-01-21 04:20:24
Question: I want to exhaustively search the entire parameter space of my support vector classifier using GridSearchCV. However, some combinations of parameters are forbidden by LinearSVC and throw an exception. In particular, there are mutually exclusive combinations of the dual, penalty, and loss parameters. For example, this code:

    from sklearn import svm, datasets
    from sklearn.model_selection import GridSearchCV

    iris = datasets.load_iris()
    parameters = {'dual': [True, False], 'penalty': ['l1', 'l2'], \
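
One common workaround (a sketch, not taken from the truncated excerpt; the grid values are illustrative) is to give GridSearchCV a list of parameter dicts, so that only valid combinations are ever enumerated:

    from sklearn import datasets
    from sklearn.model_selection import GridSearchCV
    from sklearn.svm import LinearSVC

    iris = datasets.load_iris()

    # Each dict describes one legal region of LinearSVC's parameter space;
    # GridSearchCV searches their union, so forbidden combinations never occur.
    param_grid = [
        {'dual': [False], 'penalty': ['l1'], 'loss': ['squared_hinge']},
        {'dual': [True, False], 'penalty': ['l2'], 'loss': ['squared_hinge']},
        {'dual': [True], 'penalty': ['l2'], 'loss': ['hinge']},
    ]
    grid = GridSearchCV(LinearSVC(max_iter=10000), param_grid, cv=5)
    grid.fit(iris.data, iris.target)
    print(grid.best_params_)

An alternative is to leave the full grid in place and pass error_score=np.nan (or 0) to GridSearchCV, so forbidden combinations are recorded as failures instead of aborting the search.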

Python: Gridsearch Without Machine Learning?

可紊 submitted on 2020-01-15 10:36:46
Question: I want to optimize an algorithm that has several variable parameters as input. For machine learning tasks, sklearn offers hyperparameter optimization through its grid-search functionality. Is there a standardized way / library in Python that allows the optimization of hyperparameters but is not limited to machine learning topics? Answer 1: You can create a custom pipeline/estimator (see http://scikit-learn.org/dev/developers/contributing.html#rolling-your-own-estimator) with a score
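
Alternatively, sklearn's grid machinery can be used without any estimator at all. A minimal sketch, with a hypothetical objective function standing in for the algorithm being tuned:

    from sklearn.model_selection import ParameterGrid

    # Hypothetical black-box objective to minimize; any plain callable works.
    def objective(a, b):
        return (a - 3) ** 2 + abs(b)

    # ParameterGrid expands the dict into every combination, just as
    # GridSearchCV does internally, but leaves the evaluation loop to you.
    grid = ParameterGrid({'a': [1, 2, 3, 4], 'b': [-1, 0, 1]})
    best = min(grid, key=lambda params: objective(**params))
    print(best)  # {'a': 3, 'b': 0}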

Hyperparameter tuning in Keras (MLP) via RandomizedSearchCV

那年仲夏 submitted on 2020-01-14 05:10:29
Question: I have been trying to tune a neural net for some time now but unfortunately I cannot get good performance out of it. I have a time-series dataset and I am using RandomizedSearchCV for binary classification. My code is below. Any suggestions or help will be appreciated. One thing I am still trying to figure out is how to incorporate early stopping. EDIT: Forgot to add that I am measuring performance with the F1-macro metric and I cannot get a score higher than 0.68. Another
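
The question's model and data are not shown, so the following is only a sketch of how early stopping can be threaded through RandomizedSearchCV, assuming the older keras.wrappers.scikit_learn wrapper; the architecture and data here are placeholders:

    import numpy as np
    from keras.callbacks import EarlyStopping
    from keras.layers import Dense
    from keras.models import Sequential
    from keras.wrappers.scikit_learn import KerasClassifier
    from sklearn.model_selection import RandomizedSearchCV

    X = np.random.rand(200, 10).astype('float32')  # placeholder data
    y = np.random.randint(0, 2, 200)

    def build_mlp(units=32):
        # placeholder MLP; the question's actual architecture is not shown
        model = Sequential([
            Dense(units, activation='relu', input_shape=(X.shape[1],)),
            Dense(1, activation='sigmoid'),
        ])
        model.compile(optimizer='adam', loss='binary_crossentropy')
        return model

    clf = KerasClassifier(build_fn=build_mlp, epochs=100, batch_size=32,
                          verbose=0)
    search = RandomizedSearchCV(clf, {'units': [16, 32, 64]}, n_iter=3,
                                scoring='f1_macro', cv=3)

    # Extra keyword arguments to fit() are forwarded to every internal
    # model.fit() call, which is where the EarlyStopping callback attaches.
    search.fit(X, y, validation_split=0.1,
               callbacks=[EarlyStopping(monitor='val_loss', patience=5)])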

Use sklearn's GridSearchCV with a pipeline, preprocessing just once

二次信任 submitted on 2020-01-09 12:54:08
Question: I'm using scikit-learn to tune a model's hyper-parameters. I'm using a pipeline to chain the preprocessing with the estimator. A simple version of my problem would look like this:

    import numpy as np
    from sklearn.model_selection import GridSearchCV
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.linear_model import LogisticRegression

    grid = GridSearchCV(make_pipeline(StandardScaler(), LogisticRegression()),
                        param_grid={
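
The usual answer (a sketch, not taken from the truncated excerpt) is the pipeline's memory argument, which caches fitted transformers on disk so the preprocessing is not recomputed for every candidate parameter setting:

    from tempfile import mkdtemp
    from sklearn.datasets import load_iris
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import GridSearchCV
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    X, y = load_iris(return_X_y=True)

    # With memory set, each transformer's fit result is cached keyed on its
    # parameters and input data, so the scaler is fit once per fold rather
    # than once per (fold, C) combination.
    pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000),
                         memory=mkdtemp())
    grid = GridSearchCV(pipe,
                        param_grid={'logisticregression__C': [0.1, 1, 10]},
                        cv=5)
    grid.fit(X, y)

StandardScaler is cheap, so this only pays off when the cached steps are expensive; note also that the cache cannot collapse work across folds, since each fold sees different training data.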

Scorer function: difference between make_scorer/score_func and

六月ゝ 毕业季﹏ submitted on 2020-01-05 09:25:34
Question: In scikit-learn's (0.18.1) documentation I find what follows a bit confusing. It seems that writing your own scoring function is doable in multiple ways, but what's the difference? GridSearchCV takes a scoring argument as a: "scorer callable object / function with signature scorer(estimator, X, y)". This option is also supported in the model evaluation docs. Conversely, make_scorer wants a score_func as a: "score function (or loss function) with signature score_func(y, y_pred, **kwargs)". Example Both
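
The two signatures describe the same end product at different levels. A minimal sketch of both forms (the metric choice here is illustrative):

    from sklearn.datasets import load_iris
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import f1_score, make_scorer
    from sklearn.model_selection import GridSearchCV

    X, y = load_iris(return_X_y=True)

    # Form 1: a full scorer, signature scorer(estimator, X, y). It must run
    # the prediction itself and return one number (greater is better).
    def my_scorer(estimator, X, y):
        return f1_score(y, estimator.predict(X), average='macro')

    # Form 2: a bare metric, signature score_func(y, y_pred). make_scorer
    # wraps it into an equivalent scorer(estimator, X, y) callable.
    wrapped = make_scorer(f1_score, average='macro')

    # Either object can be passed as scoring=; they behave identically here.
    grid = GridSearchCV(LogisticRegression(max_iter=1000),
                        {'C': [0.1, 1, 10]}, scoring=wrapped, cv=5)
    grid.fit(X, y)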

What is _passthrough_scorer and How Can I Change Scorers in GridsearchCV (sklearn)?

为君一笑 submitted on 2020-01-04 07:28:43
Question: http://scikit-learn.org/stable/modules/generated/sklearn.grid_search.GridSearchCV.html (for reference)

    x = [[2], [1], [3], [1] ... ]  # about 1000 data points
    grid = GridSearchCV(KernelDensity(),
                        {'bandwidth': np.linspace(0.1, 1.0, 10)}, cv=10)
    grid.fit(x)

When I use GridSearchCV without specifying a scoring function, as in the code above, the value of grid.scorer_ is <function _passthrough_scorer at 0x...>. Could you explain what kind of function _passthrough_scorer is? In addition to this, I want to change the scoring function to mean_squared_error
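
For what it's worth, _passthrough_scorer is essentially a one-liner that defers to the estimator's own score method (for KernelDensity, the total log-likelihood of the data, which is why it needs no labels). A sketch of that behavior, plus the keyword for swapping in a different scorer on a supervised estimator, since KernelDensity has no predict() for mean_squared_error to consume:

    import numpy as np
    from sklearn.datasets import load_iris
    from sklearn.linear_model import Ridge
    from sklearn.metrics import make_scorer, mean_squared_error
    from sklearn.model_selection import GridSearchCV

    # What _passthrough_scorer amounts to: delegate to estimator.score().
    def passthrough_scorer(estimator, *args, **kwargs):
        return estimator.score(*args, **kwargs)

    # Swapping scorers on a supervised estimator is one keyword argument;
    # greater_is_better=False flips the sign so GridSearchCV still maximizes.
    X, y = load_iris(return_X_y=True)
    mse_scorer = make_scorer(mean_squared_error, greater_is_better=False)
    grid = GridSearchCV(Ridge(), {'alpha': np.logspace(-3, 1, 5)},
                        scoring=mse_scorer, cv=10)
    grid.fit(X, y)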

Grid Search and Early Stopping Using Cross Validation with XGBoost in SciKit-Learn

牧云@^-^@ submitted on 2019-12-31 21:43:10
Question: I am fairly new to scikit-learn and have been trying to hyper-parameter tune XGBoost. My aim is to use early stopping and grid search to tune the model parameters, with early stopping controlling the number of trees and avoiding overfitting. As I am using cross-validation for the grid search, I was hoping to also use cross-validation in the early-stopping criteria. The code I have so far looks like this:

    import numpy as np
    import pandas as pd
    from sklearn import model_selection
    import xgboost
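
One common pattern (a sketch assuming the older xgboost sklearn API, where fit() accepts early_stopping_rounds and eval_set; the data and grid values are placeholders) is to hold out one fixed validation set and forward the early-stopping arguments through GridSearchCV.fit:

    import numpy as np
    import xgboost as xgb
    from sklearn.model_selection import GridSearchCV, train_test_split

    X = np.random.rand(500, 8)  # placeholder data
    y = np.random.randint(0, 2, 500)
    X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.2,
                                                random_state=0)

    grid = GridSearchCV(xgb.XGBClassifier(n_estimators=1000),
                        {'max_depth': [3, 5], 'learning_rate': [0.05, 0.1]},
                        cv=3)

    # Keyword arguments to fit() are forwarded to XGBClassifier.fit() on every
    # fold, so each candidate stops adding trees once eval_set stops improving.
    grid.fit(X_tr, y_tr, early_stopping_rounds=20,
             eval_set=[(X_val, y_val)], verbose=False)

The caveat, and the tension the question raises, is that the same fixed eval_set is reused across all folds; early stopping on a per-fold validation split requires a custom loop or wrapper.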

Sklearn How to Save a Model Created From a Pipeline and GridSearchCV Using Joblib or Pickle?

ぐ巨炮叔叔 submitted on 2019-12-31 12:56:08
Question: After identifying the best parameters using a pipeline and GridSearchCV, how do I pickle / joblib this process to re-use later? I see how to do this when it's a single classifier:

    from sklearn.externals import joblib
    joblib.dump(clf, 'filename.pkl')

But how do I save this overall pipeline with the best parameters after performing and completing a grid search? I tried:

    joblib.dump(grid, 'output.pkl')

but that dumped every grid-search attempt (many files), and

    joblib.dump(pipeline, 'output.pkl')
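
The standard answer is to dump grid.best_estimator_, which (with the default refit=True) is the whole pipeline refit on the full training data with the winning parameters. A self-contained sketch (in newer scikit-learn, joblib is imported directly rather than from sklearn.externals):

    import joblib
    from sklearn.datasets import load_iris
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import GridSearchCV
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    X, y = load_iris(return_X_y=True)
    grid = GridSearchCV(
        make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
        {'logisticregression__C': [0.1, 1, 10]}, cv=5)
    grid.fit(X, y)

    # best_estimator_ is one object holding preprocessing and model together,
    # so a single file round-trips the whole fitted pipeline.
    joblib.dump(grid.best_estimator_, 'best_pipeline.pkl')
    model = joblib.load('best_pipeline.pkl')
    print(model.predict(X[:5]))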

Using Smote with Gridsearchcv in Scikit-learn

百般思念 submitted on 2019-12-29 01:35:07
Question: I'm dealing with an imbalanced dataset and want to do a grid search to tune my model's parameters using scikit-learn's GridSearchCV. To oversample the data, I want to use SMOTE, and I know I can include that as a stage of a pipeline and pass it to GridSearchCV. My concern is that I think SMOTE will be applied to both train and validation folds, which is not what you are supposed to do: the validation set should not be oversampled. Am I right that the whole pipeline will be applied to both dataset
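
The usual resolution (a sketch; the classifier and data here are placeholders) is imbalanced-learn's own Pipeline, which treats SMOTE as a sampler: it runs only during fit, i.e. on each training fold, and is skipped when the pipeline transforms or scores the validation fold:

    from imblearn.over_sampling import SMOTE
    from imblearn.pipeline import Pipeline
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import GridSearchCV

    X, y = make_classification(n_samples=500, weights=[0.9, 0.1],
                               random_state=0)

    # imblearn's Pipeline resamples only inside fit(), so validation folds
    # are scored on their original, un-oversampled class distribution.
    pipe = Pipeline([('smote', SMOTE(random_state=0)),
                     ('clf', LogisticRegression(max_iter=1000))])
    grid = GridSearchCV(pipe, {'clf__C': [0.1, 1, 10]}, scoring='f1', cv=5)
    grid.fit(X, y)
    print(grid.best_params_)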