grid-search | 易学教程

Use sklearn GridSearchCV on custom class whose fit method takes 3 arguments

Question: I'm working on a project that involves implementing some algorithms as Python classes and testing their performance. I decided to write them as sklearn estimators so that I could use GridSearchCV for validation. However, one of my algorithms, for Inductive Matrix Completion, takes more than just X and y as arguments. This becomes a problem for GridSearchCV.fit, as there appears to be no way to pass more than just X and y to the estimator's fit method. The source shows the following
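
One possible approach, sketched under assumptions: GridSearchCV.fit accepts extra keyword arguments and forwards them to the estimator's fit method (array-likes with one entry per sample are sliced per CV fold). The WeightedRidge class and its side_weights argument below are hypothetical stand-ins for the question's three-argument estimator, not the asker's algorithm.

```python
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin
from sklearn.model_selection import GridSearchCV


class WeightedRidge(RegressorMixin, BaseEstimator):
    """Hypothetical estimator whose fit takes a third argument."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, X, y, side_weights=None):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float)
        # Fall back to uniform weights when no side information is given.
        w = np.ones(len(y)) if side_weights is None else np.asarray(side_weights, dtype=float)
        # Closed-form weighted ridge regression.
        A = X.T @ (X * w[:, None]) + self.alpha * np.eye(X.shape[1])
        self.coef_ = np.linalg.solve(A, X.T @ (w * y))
        self.n_seen_ = len(w)  # number of weights received by this fit call
        return self

    def predict(self, X):
        return np.asarray(X, dtype=float) @ self.coef_


rng = np.random.default_rng(0)
X = rng.normal(size=(60, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=60)
side_weights = rng.uniform(0.5, 1.5, size=60)

search = GridSearchCV(WeightedRidge(), {"alpha": [0.01, 1.0, 100.0]}, cv=3)
# The extra keyword argument is forwarded to WeightedRidge.fit.
search.fit(X, y, side_weights=side_weights)
print(search.best_params_)
```

If the third argument is not aligned with the rows of X (for example a full side-information matrix), it may be simpler to pass it through the estimator's constructor instead.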

FeatureUnion in scikit-learn and incompatible row dimension

Question: I have started to use scikit-learn for text extraction. When I use the standard CountVectorizer and TfidfTransformer in a pipeline and try to combine them with new features (a concatenation of matrices), I get a row dimension problem. This is my pipeline: pipeline = Pipeline([('feats', FeatureUnion([('ngram_tfidf', Pipeline([('vect', CountVectorizer()), ('tfidf', TfidfTransformer())])), ('addned', AddNed()),])), ('clf', SGDClassifier()),]) This is my class AddNEd which adds 30 news
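
A minimal sketch of the usual requirement behind this kind of error: every transformer inside a FeatureUnion must return a 2-D array with exactly one row per input document. The TextStats class below is a hypothetical stand-in for the question's custom transformer, whose code is not shown in the excerpt, and the documents and labels are made up.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import FeatureUnion, Pipeline


class TextStats(TransformerMixin, BaseEstimator):
    """Hypothetical custom transformer: two extra features per document."""

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        # Shape (n_documents, 2): character count and word count.
        return np.array([[len(doc), len(doc.split())] for doc in X], dtype=float)


pipeline = Pipeline([
    ("feats", FeatureUnion([
        ("ngram_tfidf", Pipeline([
            ("vect", CountVectorizer()),
            ("tfidf", TfidfTransformer()),
        ])),
        ("stats", TextStats()),
    ])),
    ("clf", SGDClassifier(random_state=0)),
])

docs = [
    "grid search tunes parameters",
    "pipelines chain transformers",
    "the weather is sunny today",
    "rain is expected tomorrow",
]
labels = [0, 0, 1, 1]

pipeline.fit(docs, labels)
features = pipeline.named_steps["feats"].transform(docs)
print(features.shape)  # rows match the number of documents
```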

Optimizing two estimators (dependent on each other) using Sklearn Grid Search

Question: My program runs in two stages. In the first, I use sklearn's ExtraTreesClassifier together with SelectFromModel to select the most important features. Note that ExtraTreesClassifier takes many parameters, such as n_estimators, and SelectFromModel yields a different set of important features for different values of n_estimators. This means that I can optimize n_estimators to get the best features. In the second stage, I am
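
A sketch of one way to tune both stages together, assuming the second stage is another classifier (the excerpt is cut off before saying so; RandomForestClassifier here is a placeholder). Putting both stages in a Pipeline lets GridSearchCV address nested parameters with the double-underscore syntax.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Synthetic data, for illustration only.
X, y = make_classification(n_samples=200, n_features=20, n_informative=5, random_state=0)

pipe = Pipeline([
    # Stage 1: feature selection driven by ExtraTreesClassifier importances.
    ("select", SelectFromModel(ExtraTreesClassifier(random_state=0))),
    # Stage 2: assumed downstream classifier.
    ("clf", RandomForestClassifier(random_state=0)),
])

param_grid = {
    "select__estimator__n_estimators": [10, 50],
    "clf__n_estimators": [10, 50],
}

search = GridSearchCV(pipe, param_grid, cv=3)
search.fit(X, y)
print(search.best_params_)
```

Because selection is refit inside every CV fold, the selected features cannot leak information from the held-out fold.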

GridSearch for doc2vec model built using gensim

Question: I am trying to find the best hyperparameters for my trained gensim doc2vec model, which takes a document as input and creates its document embedding. My training data consists of text documents but has no labels, i.e. I have only 'X' and no 'y'. I found some related questions here, but all of the proposed solutions are for supervised models and none are for unsupervised ones like mine. Here is the code where I train my doc2vec model: def train_doc2vec( self, X:

Grid search and KerasClassifier using class weights

Question: I am trying to conduct a grid search using scikit-learn's RandomizedSearchCV together with the Keras KerasClassifier wrapper for my unbalanced multi-class classification problem. However, when I try to pass class_weight as an input, the fit method gives me the following error: RuntimeError: Cannot clone object <keras.wrappers.scikit_learn.KerasClassifier object at 0x000002AA3C676710>, as the constructor either does not set or modifies parameter class_weight Below are the functions that I use

How to compare different metrics?

Question: I am using GridSearchCV to tune the hyperparameters. I would also like to compare different metrics with each other: def create_model(... model.add(Dense(,..) model.compile(..) return model model = KerasRegressor(build_fn=create_model, verbose=0) grid_obj = GridSearchCV(estimator=model, param_grid=hypparas, n_jobs=1, cv=3, scoring=['explained_variance', 'neg_mean_squared_error', 'r2'], refit='neg_mean_squared_error', return_train_score=True, verbose=2) grid_result = grid_obj
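
A minimal sketch of how the metrics can be compared side by side. With a list passed as scoring, cv_results_ holds one mean_test_<metric> and one rank_test_<metric> column per metric. Ridge and the synthetic data below are stand-ins for the question's Keras model, which is elided in the excerpt.

```python
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

X, y = make_regression(n_samples=120, n_features=5, noise=10.0, random_state=0)
scoring = ["explained_variance", "neg_mean_squared_error", "r2"]

grid = GridSearchCV(
    Ridge(),  # stand-in estimator, not the question's KerasRegressor
    {"alpha": [0.1, 1.0, 10.0]},
    scoring=scoring,
    refit="neg_mean_squared_error",  # metric used to pick best_estimator_
    cv=3,
    return_train_score=True,
)
grid.fit(X, y)

results = pd.DataFrame(grid.cv_results_)
columns = ["param_alpha"] + [f"mean_test_{name}" for name in scoring]
comparison = results[columns]
print(comparison)
```

Each row is one parameter combination, so the metrics can be compared or plotted against each other directly.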

GridSearchCV for number of neurons

Question: I am trying to teach myself how to grid-search the number of neurons in a basic multi-layer neural network. I am using GridSearchCV and KerasClassifier in Python, along with Keras. The code below works very well for other data sets, but for some reason I could not make it work for the Iris dataset, and I cannot figure out why; I am missing something here. The result I get is: Best: 0.000000 using {'n_neurons': 3} 0.000000 (0.000000) with: {'n_neurons': 3} 0.000000 (0.000000) with: {'n_neurons': 5

Iterating across multiple columns in Pandas DF and slicing dynamically

Question: TL;DR: How do I iterate across all options of multiple columns in a pandas DataFrame without specifying the columns or their values explicitly? Long version: I have a pandas DataFrame that looks like this, only it has many more features (drug dose combinations) than are listed here. Instead of just 3 types of features, it could have something like 70:

> dosage_df
First Score  Last Score  A_dose  B_dose  C_dose
22           28          1       40      130
55           11          2       40      130
15           72          3       40      130
42           67          1       90      130
90           74          2       90      130
87           89          3       90
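
A sketch of one reading of the question: discover the dose columns by name and slice the frame for every value each column takes, without hard-coding either. It uses only the five complete rows visible in the excerpt, and the "_dose" suffix convention is an assumption taken from the column names shown.

```python
import pandas as pd

# The five complete rows from the excerpt.
dosage_df = pd.DataFrame({
    "First Score": [22, 55, 15, 42, 90],
    "Last Score": [28, 11, 72, 67, 74],
    "A_dose": [1, 2, 3, 1, 2],
    "B_dose": [40, 40, 40, 90, 90],
    "C_dose": [130, 130, 130, 130, 130],
})

# Find the feature columns dynamically instead of listing them.
dose_cols = [col for col in dosage_df.columns if col.endswith("_dose")]

slices = {}
for col in dose_cols:
    # groupby yields one (value, sub-frame) pair per distinct value in the column.
    for value, subset in dosage_df.groupby(col):
        slices[(col, value)] = subset

for (col, value), subset in slices.items():
    print(col, value, len(subset))
```

To iterate over combinations of values across columns rather than one column at a time, dosage_df.groupby(dose_cols) yields one group per observed combination.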

GridSearchCV - access to predicted values across tests?

Question: Is there a way to access the predicted values calculated within a GridSearchCV process? I'd like to be able to plot the predicted y values against their actual values (from the test/validation set). Once the grid search is complete, I can apply it to some other data using ypred = grid.predict(xv), but I'd like to be able to plot the values calculated during the grid search. Maybe there's a way of saving the points as a pandas DataFrame? from sklearn.preprocessing import
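
GridSearchCV keeps scores in cv_results_ but not the per-fold predictions. A sketch of one workaround: rerun the winning configuration through cross_val_predict with the same splitter, which gives one out-of-fold prediction per sample. Ridge and the synthetic data are placeholders for the question's model and data.

```python
from sklearn.base import clone
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, KFold, cross_val_predict

X, y = make_regression(n_samples=100, n_features=4, noise=5.0, random_state=0)

# A fixed random_state makes the splitter produce the same folds on every call.
cv = KFold(n_splits=5, shuffle=True, random_state=0)

grid = GridSearchCV(Ridge(), {"alpha": [0.1, 1.0, 10.0]}, cv=cv)
grid.fit(X, y)

# Unfitted copy of the estimator, configured with the best parameters.
best = clone(grid.estimator).set_params(**grid.best_params_)

# Each entry is predicted by a model that did not see that sample in training.
y_oof = cross_val_predict(best, X, y, cv=cv)
print(y_oof[:5], y[:5])
```

The y_oof array can then be plotted against y, or stored in a pandas DataFrame alongside it.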

Use of OneClassSVM with GridSearchCV

Question: I am trying to run GridSearchCV on OneClassSVM, but I can't seem to find the right scoring method for OCSVM. From what I've gathered, something like OneClassSVM.score does not exist, so it doesn't have the default scoring function that GridSearchCV needs. Unfortunately, none of the scoring methods from the documentation work either, because they are dedicated to supervised ML and OCSVM is an unsupervised method. Is there any way to perform GridSearch (or something similar to it, letting
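
A sketch that only applies under a strong assumption: that a few labelled examples (+1 for inliers, -1 for outliers) are available for validation. OneClassSVM.fit ignores y, but its predict returns +1/-1, so a supervised metric such as F1 can then serve as the scoring function. The data below is synthetic and the parameter grid is arbitrary.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
inliers = rng.normal(0.0, 1.0, size=(100, 2))
outliers = rng.uniform(-6.0, 6.0, size=(10, 2))
X = np.vstack([inliers, outliers])
# Assumed validation labels: +1 for inliers, -1 for outliers.
y = np.r_[np.ones(100, dtype=int), -np.ones(10, dtype=int)]

# Stratify so that every fold contains some outliers.
cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)

search = GridSearchCV(
    OneClassSVM(kernel="rbf"),
    {"nu": [0.05, 0.1, 0.2], "gamma": [0.01, 0.1, 1.0]},
    scoring="f1",  # compares predict() output (+1/-1) with y
    cv=cv,
)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```

Without any labels, a custom scoring callable with the signature scorer(estimator, X, y=None) would be needed instead, and the choice of criterion is then problem-specific.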