grid-search

“Parallel” pipeline to get best model using gridsearch

此生再无相见时 submitted on 2019-12-01 14:39:51
In sklearn, a serial pipeline can be defined to get the best combination of hyperparameters for all consecutive parts of the pipeline. A serial pipeline can be implemented as follows:

    from sklearn.svm import SVC
    from sklearn import decomposition, datasets
    from sklearn.pipeline import Pipeline
    from sklearn.model_selection import GridSearchCV

    digits = datasets.load_digits()
    X_train = digits.data
    y_train = digits.target

    # Use Principal Component Analysis to reduce dimensionality
    # and improve generalization
    pca = decomposition.PCA()
    # Use a linear SVC
    svm = SVC()
    # Combine PCA and SVC to a
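
The excerpt breaks off before the pipeline itself is assembled. A minimal sketch of how the remaining steps are typically wired together, continuing from the objects above (the parameter names and grid values here are illustrative assumptions, not taken from the original post):

    # Sketch of the assumed continuation: combine PCA and SVC into one Pipeline
    # and let GridSearchCV search over both steps' hyperparameters at once.
    pipe = Pipeline(steps=[('pca', pca), ('svm', svm)])

    # Hypothetical grid: PCA component counts and SVC regularization values.
    param_grid = {
        'pca__n_components': [20, 40, 64],
        'svm__C': [0.1, 1.0, 10.0],
    }

    search = GridSearchCV(pipe, param_grid, cv=5)
    search.fit(X_train, y_train)
    print(search.best_params_, search.best_score_)

Because the two steps live in one serial pipeline, a single grid search evaluates every combination of PCA and SVC settings together rather than tuning each part in isolation.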

Fitting in nested cross-validation with cross_val_score with pipeline and GridSearch

房东的猫 submitted on 2019-12-01 12:38:17
I am working in scikit-learn and I am trying to tune my XGBoost. I made an attempt at a nested cross-validation, using the pipeline for the rescaling of the training folds (to avoid data leakage and overfitting), together with GridSearchCV for parameter tuning and cross_val_score to get the roc_auc score at the end.

    from imblearn.pipeline import Pipeline
    from sklearn.model_selection import RepeatedKFold
    from sklearn.model_selection import GridSearchCV
    from sklearn.model_selection import cross_val_score
    from sklearn.preprocessing import StandardScaler  # import missing from the excerpt
    from xgboost import XGBClassifier

    std_scaling = StandardScaler()
    algo = XGBClassifier()
    steps
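
A condensed sketch of how such a nested cross-validation is usually put together, continuing from the imports above (the parameter grid, fold counts and data placeholders are assumptions, not the original poster's values):

    # Inner loop: GridSearchCV tunes the pipeline (scaler + XGBoost).
    # Outer loop: cross_val_score measures generalization with roc_auc.
    pipeline = Pipeline([('scaler', std_scaling), ('clf', algo)])

    # Hypothetical grid over two XGBoost parameters.
    param_grid = {'clf__max_depth': [3, 5], 'clf__n_estimators': [100, 300]}

    inner_cv = RepeatedKFold(n_splits=5, n_repeats=2, random_state=0)
    outer_cv = RepeatedKFold(n_splits=5, n_repeats=2, random_state=1)

    grid = GridSearchCV(pipeline, param_grid, scoring='roc_auc', cv=inner_cv)
    # X, y: the binary-labelled training data (not shown in the excerpt).
    # nested_scores = cross_val_score(grid, X, y, scoring='roc_auc', cv=outer_cv)
    # print(nested_scores.mean())

Because the scaler sits inside the pipeline, it is refit on each training fold only, which is what prevents the data leakage the poster is worried about.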

Keras Wrappers for Scikit Learn - AUC scorer is not working

你离开我真会死。 submitted on 2019-12-01 04:51:47
Question: I'm trying to use the Keras scikit-learn wrapper in order to make random search for parameters easier. I wrote an example code here where:

I generate an artificial dataset: I am using moons from scikit-learn

    from sklearn.datasets import make_moons
    dataset = make_moons(1000)

Model builder definition: I define the build_fn function needed:

    def build_fn(nr_of_layers = 2,
                 first_layer_size = 10,
                 layers_slope_coeff = 0.8,
                 dropout = 0.5,
                 activation = "relu",
                 weight_l2 = 0.01,
                 act_l2 = 0.01,
                 input_dim = 2):
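
A self-contained sketch of how the wrapper and an AUC scorer are commonly combined (the model inside build_fn is simplified here, and the explicit probability-based scorer is one common workaround rather than the poster's code; the keras.wrappers.scikit_learn module path matches the Keras versions current at the time of the post):

    import numpy as np
    from keras.models import Sequential
    from keras.layers import Dense
    from keras.wrappers.scikit_learn import KerasClassifier
    from sklearn.datasets import make_moons
    from sklearn.metrics import roc_auc_score, make_scorer
    from sklearn.model_selection import RandomizedSearchCV

    X, y = make_moons(1000, noise=0.2, random_state=0)

    def build_fn(first_layer_size=10):
        # Simplified stand-in for the poster's build_fn.
        model = Sequential()
        model.add(Dense(first_layer_size, activation='relu', input_dim=2))
        model.add(Dense(1, activation='sigmoid'))
        model.compile(optimizer='adam', loss='binary_crossentropy')
        return model

    clf = KerasClassifier(build_fn=build_fn, first_layer_size=10,
                          epochs=10, batch_size=32, verbose=0)

    # Score on predicted probabilities rather than hard labels; this is one
    # common fix when scoring='roc_auc' misbehaves with the Keras wrapper.
    auc_scorer = make_scorer(roc_auc_score, needs_proba=True)

    search = RandomizedSearchCV(clf, {'first_layer_size': [5, 10, 20]},
                                n_iter=3, scoring=auc_scorer, cv=3)
    search.fit(X, y)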

Keras: Out of memory when doing hyper parameter grid search

◇◆丶佛笑我妖孽 submitted on 2019-11-30 06:55:42
I'm running multiple nested loops to do a hyperparameter grid search. Each nested loop runs through a list of hyperparameter values, and inside the innermost loop a Keras sequential model is built and evaluated each time using a generator. (I'm not doing any training; I'm just randomly initializing and then evaluating the model multiple times and then retrieving the average loss.) My problem is that during this process Keras seems to be filling up my GPU memory, so that I eventually get an OOM error. Does anybody know how to solve this and free up the GPU memory each time after a model is
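
One commonly suggested remedy, sketched below with a stand-in model rather than the poster's code: clear the Keras/TensorFlow session after each candidate so the graph and the GPU memory it holds are released before the next configuration is built.

    from keras import backend as K
    from keras.models import Sequential
    from keras.layers import Dense

    def build_model(units):
        # Stand-in for whatever model the hyperparameter loops construct.
        model = Sequential()
        model.add(Dense(units, activation='relu', input_dim=10))
        model.add(Dense(1))
        model.compile(optimizer='adam', loss='mse')
        return model

    for units in [32, 64, 128]:   # stand-in for the nested hyperparameter loops
        model = build_model(units)
        # ... evaluate the model here, e.g. with the generator ...
        del model
        K.clear_session()         # release graph/GPU memory between candidates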

how to tune parameters of custom kernel function with pipeline in scikit-learn

▼魔方 西西 submitted on 2019-11-29 03:39:55
Currently I have successfully defined a custom kernel function (pre-computing the kernel matrix) as an ordinary def function, and now I am using the GridSearchCV function to get the best parameters. So, in the custom kernel function there are a total of 2 parameters to be tuned (namely gamma and sea_gamma in the example below), and for the SVR model the cost C parameter has to be tuned as well. But until now I can only tune the cost C parameter using GridSearchCV -> please refer to the Part I example below. I have searched for some similar solutions such as: Is it possible to tune parameters
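
One common workaround, sketched below under the assumption that the two kernel parameters are called gamma and sea_gamma as in the post: wrap SVR in a small estimator class that exposes the kernel parameters through __init__, so GridSearchCV can set all three of C, gamma and sea_gamma together. The kernel body here is only a placeholder for the real pre-computed kernel.

    import numpy as np
    from sklearn.base import BaseEstimator, RegressorMixin
    from sklearn.svm import SVR
    from sklearn.model_selection import GridSearchCV

    class CustomKernelSVR(BaseEstimator, RegressorMixin):
        def __init__(self, C=1.0, gamma=1.0, sea_gamma=1.0):
            self.C = C
            self.gamma = gamma
            self.sea_gamma = sea_gamma

        def _kernel(self, X, Y):
            # Placeholder RBF-style kernel; the real gamma/sea_gamma logic
            # from the post would go here instead.
            d = (np.sum(X ** 2, axis=1)[:, None]
                 + np.sum(Y ** 2, axis=1)[None, :]
                 - 2.0 * X @ Y.T)
            return np.exp(-self.gamma * d) * np.exp(-self.sea_gamma)

        def fit(self, X, y):
            # SVR accepts a callable kernel, so the parameter values set by
            # GridSearchCV are picked up each time fit is called.
            self.svr_ = SVR(kernel=self._kernel, C=self.C)
            self.svr_.fit(X, y)
            return self

        def predict(self, X):
            return self.svr_.predict(X)

    # All three parameters can now be searched in one grid.
    param_grid = {'C': [1, 10], 'gamma': [0.1, 1.0], 'sea_gamma': [0.1, 1.0]}
    search = GridSearchCV(CustomKernelSVR(), param_grid, cv=3)
    # search.fit(X, y)  # X, y: the training data from the original problem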

Early stopping with Keras and sklearn GridSearchCV cross-validation

时光毁灭记忆、已成空白 submitted on 2019-11-28 21:18:51
Question: I wish to implement early stopping with Keras and sklearn's GridSearchCV. The working code example below is modified from How to Grid Search Hyperparameters for Deep Learning Models in Python With Keras. The data set may be downloaded from here. The modification adds the Keras EarlyStopping callback class to prevent over-fitting. For this to be effective it requires the monitor='val_acc' argument for monitoring validation accuracy. For val_acc to be available KerasClassifier requires the
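
A sketch of the usual wiring (the model, grid and input_dim=8 for the tutorial's diabetes data set are assumptions, not the post's full code): the EarlyStopping callback and a validation split are passed as fit parameters, which GridSearchCV forwards to KerasClassifier.fit, so val_acc becomes available to the callback for each candidate.

    from keras.models import Sequential
    from keras.layers import Dense
    from keras.callbacks import EarlyStopping
    from keras.wrappers.scikit_learn import KerasClassifier
    from sklearn.model_selection import GridSearchCV

    def create_model(neurons=10):
        # Simplified stand-in for the tutorial's model.
        model = Sequential()
        model.add(Dense(neurons, activation='relu', input_dim=8))
        model.add(Dense(1, activation='sigmoid'))
        model.compile(loss='binary_crossentropy', optimizer='adam',
                      metrics=['accuracy'])
        return model

    model = KerasClassifier(build_fn=create_model, neurons=10,
                            epochs=100, batch_size=10, verbose=0)

    # Newer Keras releases name the metric 'val_accuracy' instead of 'val_acc'.
    early_stop = EarlyStopping(monitor='val_acc', patience=5)

    grid = GridSearchCV(model, param_grid={'neurons': [5, 10, 20]}, cv=3)
    # Fit keyword arguments are forwarded to the underlying Keras model.fit;
    # X, y are the features and labels loaded from the downloaded data set.
    # grid.fit(X, y, callbacks=[early_stop], validation_split=0.2)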

Use sklearn's GridSearchCV with a pipeline, preprocessing just once

梦想的初衷 submitted on 2019-11-28 18:48:48
I'm using scikit-learn to tune a model's hyperparameters. I'm using a pipeline to chain the preprocessing with the estimator. A simple version of my problem would look like this:

    import numpy as np
    from sklearn.model_selection import GridSearchCV
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.linear_model import LogisticRegression

    grid = GridSearchCV(make_pipeline(StandardScaler(), LogisticRegression()),
                        param_grid={'logisticregression__C': [0.1, 10.]},
                        cv=2,
                        refit=False)
    _ = grid.fit(X=np.random.rand(10, 3), y=np.random.randint(2,
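
One way to avoid refitting the preprocessing for every parameter candidate, continuing from the excerpt above, is the pipeline's memory argument, which caches fitted transformers on disk (the cache-directory handling below is an illustrative sketch):

    from tempfile import mkdtemp
    from shutil import rmtree

    cachedir = mkdtemp()
    # Cached pipeline: the fitted StandardScaler is reused across candidates
    # that share the same training fold and scaler parameters.
    pipe = make_pipeline(StandardScaler(), LogisticRegression(), memory=cachedir)

    grid = GridSearchCV(pipe,
                        param_grid={'logisticregression__C': [0.1, 10.]},
                        cv=2, refit=False)
    _ = grid.fit(X=np.random.rand(10, 3), y=np.random.randint(2, size=(10,)))

    rmtree(cachedir)  # remove the cache directory when the search is done

Caching only pays off when the grid leaves the preprocessing step's parameters fixed, since the cache key includes both the transformer's parameters and the training fold it was fitted on.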