grid-search

Use sklearn GridSearchCV on custom class whose fit method takes 3 arguments

十年热恋 submitted on 2019-12-12 04:09:41
Question: I'm working on a project that involves implementing some algorithms as Python classes and testing their performance. I decided to write them as sklearn estimators so that I could use GridSearchCV for validation. However, one of my algorithms, for Inductive Matrix Completion, takes more than just X and y as arguments. This becomes a problem for GridSearchCV.fit, as there appears to be no way to pass more than just X and y to the fit method of the estimator. The source shows the following
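A minimal sketch of one way this can work: GridSearchCV.fit forwards extra keyword arguments to the estimator's fit method, and array-like arguments with one entry per sample are split along with X across the folds. The WeightedRidge class and its weights argument below are hypothetical stand-ins, not the Inductive Matrix Completion algorithm from the question.

```python
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin
from sklearn.model_selection import GridSearchCV


class WeightedRidge(RegressorMixin, BaseEstimator):
    """Hypothetical estimator whose fit takes a third argument, `weights`."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, X, y, weights=None):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float)
        w = np.ones(len(y)) if weights is None else np.asarray(weights, dtype=float)
        # Record whether the extra argument arrived (for illustration only).
        self.got_weights_ = weights is not None
        # Closed-form weighted ridge: (X^T W X + alpha * I)^-1 X^T W y
        XtW = X.T * w
        self.coef_ = np.linalg.solve(XtW @ X + self.alpha * np.eye(X.shape[1]), XtW @ y)
        return self

    def predict(self, X):
        return np.asarray(X, dtype=float) @ self.coef_


# Synthetic data, made up for this sketch.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=60)
weights = rng.uniform(0.5, 1.5, size=60)

grid = GridSearchCV(WeightedRidge(), {"alpha": [0.01, 1.0, 100.0]}, cv=3)
# The extra keyword is passed through to WeightedRidge.fit for every fold.
grid.fit(X, y, weights=weights)
```

Note that only fit receives the extra argument here; predict and score still see X and y alone, so this pattern suits side information that is needed only during training.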

FeatureUnion in scikit-learn and incompatible row dimension

﹥>﹥吖頭↗ submitted on 2019-12-12 03:39:02
Question: I have started to use scikit-learn for text extraction. When I use the standard CountVectorizer and TfidfTransformer in a pipeline and try to combine them with new features (a concatenation of matrices), I get a row dimension problem. This is my pipeline: pipeline = Pipeline([('feats', FeatureUnion([('ngram_tfidf', Pipeline([('vect', CountVectorizer()), ('tfidf', TfidfTransformer())])), ('addned', AddNed())])), ('clf', SGDClassifier())]) This is my class AddNed, which adds 30 news
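A sketch of the constraint that usually lies behind this kind of error: FeatureUnion stacks the outputs of its transformers side by side, so each one has to return exactly one row per input document. The TextLength class below is a hypothetical stand-in for AddNed, whose code is not shown in the preview.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.pipeline import FeatureUnion, Pipeline


class TextLength(TransformerMixin, BaseEstimator):
    """Hypothetical stand-in for AddNed: one extra feature per document."""

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        # Shape (n_samples, 1): exactly one row per input document.
        return np.array([[len(doc)] for doc in X], dtype=float)


# Toy documents, made up for this sketch.
docs = ["grid search in sklearn", "feature union example", "short text"]

feats = FeatureUnion([
    ("ngram_tfidf", Pipeline([("vect", CountVectorizer()),
                              ("tfidf", TfidfTransformer())])),
    ("extra", TextLength()),
])
combined = feats.fit_transform(docs)
print(combined.shape)  # the row count equals len(docs)
```

If a custom transformer returns a precomputed matrix for the whole corpus instead of one row per element of the X it is given, the row counts diverge as soon as cross-validation hands it a subset.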

Optimizing two estimators (dependent on each other) using Sklearn Grid Search

无人久伴 submitted on 2019-12-11 17:18:41
Question: The flow of my program has two stages. I am using the sklearn ExtraTreesClassifier along with the SelectFromModel method to select the most important features. Note that ExtraTreesClassifier takes many parameters, such as n_estimators, and SelectFromModel eventually gives a different set of important features for different values of n_estimators. This means that I can optimize n_estimators to get the best features. In the second stage, I am
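One possible way to tune both stages together is to chain them in a Pipeline and address the nested parameters with double-underscore names. In the sketch below the second-stage estimator is an assumption (LogisticRegression), since the preview is cut off before naming it, and the data is synthetic.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Synthetic data, made up for this sketch.
X, y = make_classification(n_samples=200, n_features=20,
                           n_informative=5, random_state=0)

pipe = Pipeline([
    # Stage 1: feature selection driven by tree importances.
    ("select", SelectFromModel(ExtraTreesClassifier(random_state=0))),
    # Stage 2: assumed downstream classifier (not given in the question).
    ("clf", LogisticRegression(max_iter=1000)),
])

param_grid = {
    # <step>__<attribute>__<parameter> reaches into the wrapped estimator.
    "select__estimator__n_estimators": [10, 50],
    "clf__C": [0.1, 1.0],
}

search = GridSearchCV(pipe, param_grid, cv=3)
search.fit(X, y)
print(search.best_params_)
```

Because the selector is refit inside every fold, the feature set is chosen from training data only, which avoids leaking validation data into the selection step.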

GridSearch for doc2vec model built using gensim

时间秒杀一切 submitted on 2019-12-11 08:44:02
Question: I am trying to find the best hyperparameters for my trained gensim doc2vec model, which takes a document as input and creates its document embedding. My training data consists of text documents, but it doesn't have any labels, i.e. I just have 'X' but not 'y'. I found some questions here related to what I am trying to do, but all of the proposed solutions are for supervised models and none are for unsupervised ones like mine. Here is the code where I am training my doc2vec model: def train_doc2vec( self, X:

Grid search and KerasClassifier using class weights

不打扰是莪最后的温柔 submitted on 2019-12-11 07:45:11
Question: I am trying to conduct a grid search using scikit-learn's RandomizedSearchCV function together with the Keras KerasClassifier wrapper for my unbalanced multi-class classification problem. However, when I try to give class_weight as an input, the fit method gives me the following error: RuntimeError: Cannot clone object <keras.wrappers.scikit_learn.KerasClassifier object at 0x000002AA3C676710>, as the constructor either does not set or modifies parameter class_weight Below are the functions that I use

How to compare different metrics?

核能气质少年 submitted on 2019-12-11 06:59:13
Question: I am using GridSearchCV to tune hyperparameters. I would also like to compare different metrics with each other: def create_model(... model.add(Dense(,..) model.compile(..) return model model = KerasRegressor(build_fn=create_model, verbose=0) grid_obj = GridSearchCV(estimator=model, param_grid=hypparas, n_jobs=1, cv=3, scoring=['explained_variance', 'neg_mean_squared_error', 'r2'], refit='neg_mean_squared_error', return_train_score=True, verbose=2) grid_result = grid_obj
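With a list passed as scoring, GridSearchCV stores one set of columns per metric in cv_results_, named mean_test_<metric>, rank_test_<metric> and so on, which makes a side-by-side comparison possible. The sketch below swaps the Keras model for a plain Ridge regressor on synthetic data so that it runs without Keras; the scoring and refit settings mirror the ones in the question.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

# Synthetic data and a Ridge model stand in for the Keras regressor.
X, y = make_regression(n_samples=120, n_features=5, noise=10.0, random_state=0)

metrics = ["explained_variance", "neg_mean_squared_error", "r2"]
grid = GridSearchCV(
    Ridge(),
    {"alpha": [0.1, 1.0, 10.0]},
    cv=3,
    scoring=metrics,
    refit="neg_mean_squared_error",  # metric used to choose best_estimator_
    return_train_score=True,
)
grid.fit(X, y)

# One array per metric, with one entry per parameter combination.
comparison = {m: grid.cv_results_[f"mean_test_{m}"] for m in metrics}
for name, scores in comparison.items():
    print(name, scores)
```

The metrics live on different scales (r2 is at most 1, neg_mean_squared_error is in squared target units), so the rank_test_<metric> columns are often the easier way to see whether they agree on the best parameters.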

GridSearchCV for number of neurons

╄→гoц情女王★ submitted on 2019-12-11 06:49:21
Question: I am trying to teach myself how to grid-search the number of neurons in a basic multi-layer neural network. I am using GridSearchCV and KerasClassifier in Python, as well as Keras. The code below works very well for other data sets, but for some reason I could not make it work for the Iris dataset, and I cannot find out why; I am missing something here. The result I get is: Best: 0.000000 using {'n_neurons': 3} 0.000000 (0.000000) with: {'n_neurons': 3} 0.000000 (0.000000) with: {'n_neurons': 5

Iterating across multiple columns in Pandas DF and slicing dynamically

你。 submitted on 2019-12-11 06:46:14
Question: TLDR: How to iterate across all options of multiple columns in a pandas dataframe without specifying the columns or their values explicitly? Long Version: I have a pandas dataframe that looks like this, only it has a lot more features or drug dose combinations than are listed here. Instead of just 3 types of features, it could have something like 70...:

    > dosage_df
    First Score  Last Score  A_dose  B_dose  C_dose
    22           28          1       40      130
    55           11          2       40      130
    15           72          3       40      130
    42           67          1       90      130
    90           74          2       90      130
    87           89          3       90
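One way to visit every dose combination without naming the columns or their values is to pick the columns by a naming pattern and let groupby enumerate the combinations that actually occur. The sketch below uses the five complete rows from the preview and assumes that every dose column ends in "_dose".

```python
import pandas as pd

# The five complete rows shown in the question's preview.
dosage_df = pd.DataFrame({
    "First Score": [22, 55, 15, 42, 90],
    "Last Score": [28, 11, 72, 67, 74],
    "A_dose": [1, 2, 3, 1, 2],
    "B_dose": [40, 40, 40, 90, 90],
    "C_dose": [130, 130, 130, 130, 130],
})

# Assumption: dose columns can be recognised by their "_dose" suffix.
dose_cols = [c for c in dosage_df.columns if c.endswith("_dose")]

# groupby yields one (combination, sub-frame) pair per observed combination.
slices = {}
for combo, subset in dosage_df.groupby(dose_cols):
    slices[combo] = subset
    print(dict(zip(dose_cols, combo)), len(subset))
```

Only combinations present in the data are produced, so this scales with the number of rows rather than with the full Cartesian product of 70 or so columns.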

GridSearchCV - access to predicted values across tests?

ぐ巨炮叔叔 submitted on 2019-12-11 00:24:07
Question: Is there a way to get access to the predicted values calculated within a GridSearchCV process? I'd like to be able to plot the predicted y values against their actual values (from the test/validation set). Once the grid search is complete, I can apply it to some other data using ypred = grid.predict(xv), but I'd like to be able to plot the values calculated during the grid search. Maybe there's a way of saving the points as a pandas dataframe? from sklearn.preprocessing import
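GridSearchCV does not keep the per-fold predictions it scores, so one workaround is to recompute them after the search with cross_val_predict, reusing the same splitter and the best parameters. This is a sketch of that workaround on synthetic data with a Ridge model; it reproduces out-of-fold predictions rather than reading them out of the search itself.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, KFold, cross_val_predict

# Synthetic data and a Ridge model, made up for this sketch.
X, y = make_regression(n_samples=100, n_features=4, noise=5.0, random_state=0)

# A splitter with a fixed random_state yields the same folds every time.
cv = KFold(n_splits=5, shuffle=True, random_state=0)

grid = GridSearchCV(Ridge(), {"alpha": [0.1, 1.0, 10.0]}, cv=cv)
grid.fit(X, y)

# Out-of-fold predictions: each sample is predicted by a model
# that did not see it during training.
y_oof = cross_val_predict(grid.best_estimator_, X, y, cv=cv)
```

The pair (y, y_oof) can then be plotted directly or placed in a pandas dataframe, one row per sample.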

Use of OneClassSVM with GridSearchCV

眉间皱痕 submitted on 2019-12-10 19:49:40
Question: I am trying to run GridSearchCV on OneClassSVM, but I can't seem to find the right scoring method for OCSVM. From what I've gathered, something like OneClassSVM.score does not exist, so it doesn't have the default scoring function that GridSearchCV needs. Unfortunately, none of the scoring methods from the documentation work either, because they are dedicated to supervised ML and OCSVM is an unsupervised method. Is there any way to perform GridSearch (or something similar to it, letting
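If at least a few labelled inliers and outliers are available for validation, one option is to pass them as y and give GridSearchCV a custom scoring callable. OneClassSVM.fit ignores y, while the scorer compares predict (which returns +1 or -1) with the labels. The sketch below assumes such labels exist and uses synthetic data; without any labels, a different, label-free criterion would be needed.

```python
import numpy as np
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.svm import OneClassSVM

# Synthetic data: 90 inliers (+1) around the origin, 10 outliers (-1) far away.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, size=(90, 2)),
               rng.uniform(6, 8, size=(10, 2))])
y = np.r_[np.ones(90, dtype=int), -np.ones(10, dtype=int)]


def inlier_outlier_score(estimator, X_val, y_val):
    """Hypothetical scorer: balanced accuracy of the +1/-1 predictions."""
    return balanced_accuracy_score(y_val, estimator.predict(X_val))


# Stratify so that every validation fold contains both classes.
cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)

grid = GridSearchCV(
    OneClassSVM(kernel="rbf"),
    {"nu": [0.05, 0.1, 0.2], "gamma": [0.01, 0.1, 1.0]},
    scoring=inlier_outlier_score,
    cv=cv,
)
grid.fit(X, y)  # y is used only for splitting and scoring, not for training
print(grid.best_params_, grid.best_score_)
```

In this setup the training folds also contain the labelled outliers; if the model should be trained on clean data only, the splits would have to be built by hand so that outliers appear in the validation part alone.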