k-fold

variable encoding in K-fold validation of random forest using package 'caret'

眉间皱痕 submitted on 2021-01-29 07:50:46

Question: I want to run an RF classification just as it is specified in 'randomForest', but still use the k-fold repeated cross-validation method (code below). How do I stop caret from creating dummy variables out of my categorical ones? I read that this may be due to one-hot encoding, but I am not sure how to change it. I would be very grateful for some example lines on how to fix this! Database:

> str(river)
'data.frame': 121 obs. of 13 variables:
 $ stat_bino : Factor w/ 2 levels "0","1": 2 2 1 1 2 2 2 2

Why should we call the split() function when passing StratifiedKFold() as a parameter to GridSearchCV?

醉酒当歌 submitted on 2020-06-16 05:55:26

Question: What am I trying to do? I am trying to use StratifiedKFold() in GridSearchCV(). What confuses me? When we use k-fold cross validation, we just pass the number of folds as cv inside GridSearchCV(), like the following:

grid_search_m = GridSearchCV(rdm_forest_clf, param_grid, cv=5, scoring='f1', return_train_score=True, n_jobs=2)

Then, when I need to use StratifiedKFold(),
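A minimal sketch of the usual answer (on assumed toy data, not the asker's rdm_forest_clf): GridSearchCV accepts a splitter object directly as its cv argument and calls .split() internally, so you never call split() yourself — you only call it manually when you want to iterate over the folds in your own loop.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold

# assumed stand-in data with a binary target, so scoring='f1' works
X, y = make_classification(n_samples=100, n_features=5, random_state=0)

# pass the splitter itself as cv; GridSearchCV invokes skf.split(X, y) internally
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
param_grid = {"n_estimators": [10, 20]}
grid = GridSearchCV(RandomForestClassifier(random_state=0),
                    param_grid, cv=skf, scoring="f1",
                    return_train_score=True)
grid.fit(X, y)
print(grid.best_params_)
```

Passing cv=5 instead would also stratify for a classifier, but passing the StratifiedKFold object lets you control shuffle and random_state.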

Sklearn Voting ensemble with models using different features and testing with k-fold cross validation

不问归期 submitted on 2020-06-01 07:41:31

Question: I have a data frame with 4 different groups of features. I need to create 4 different models from these feature groups and combine them with an ensemble voting classifier. Furthermore, I need to test the classifier using k-fold cross validation. However, I am finding it difficult to combine the different feature sets, the voting classifier, and k-fold cross validation with the functionality available in sklearn. The following is the code I have so far:

y = df1.index
x = preprocessing
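One common pattern for this (sketched on assumed synthetic data with hypothetical column groups, not the asker's df1): wrap each model in a Pipeline whose first step is a ColumnTransformer that passes through only that model's feature columns, then hand the pipelines to VotingClassifier and score the whole ensemble with cross_val_score.

```python
from sklearn.compose import ColumnTransformer
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline

X, y = make_classification(n_samples=120, n_features=8, random_state=0)

# hypothetical feature groups, one list of column indices per model
groups = [[0, 1], [2, 3], [4, 5], [6, 7]]

estimators = []
for i, cols in enumerate(groups):
    # "passthrough" keeps only this group's columns; each model sees its own view
    pipe = Pipeline([
        ("select", ColumnTransformer([("keep", "passthrough", cols)])),
        ("clf", LogisticRegression(max_iter=1000)),
    ])
    estimators.append((f"model_{i}", pipe))

voter = VotingClassifier(estimators=estimators, voting="hard")
scores = cross_val_score(voter, X, y, cv=5)  # k-fold CV of the full ensemble
print(scores.mean())
```

Because the column selection lives inside each pipeline, every CV fold refits the selection and the model together, so there is no leakage between folds.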

Cross validation for MNIST dataset with pytorch and sklearn

◇◆丶佛笑我妖孽 submitted on 2020-05-15 05:03:09

Question: I am new to pytorch and am trying to implement a feed-forward neural network to classify the MNIST data set. I have some problems when trying to use cross-validation. My data has the following shapes: x_train: torch.Size([45000, 784]) and y_train: torch.Size([45000]). I tried to use KFold from sklearn:

kfold = KFold(n_splits=10)

Here is the first part of my train method, where I'm dividing the data into folds:

for train_index, test_index in kfold.split(x_train, y_train):
    x_train_fold = x
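The fold-slicing pattern itself can be sketched with NumPy stand-ins for the tensors (torch tensors support the same integer-array indexing, e.g. x_train[train_index], so the loop body is identical); the sizes here are scaled-down assumptions, not the asker's 45000-row data.

```python
import numpy as np
from sklearn.model_selection import KFold

# stand-ins for x_train / y_train; a torch.Tensor indexes the same way
rng = np.random.default_rng(0)
x_train = rng.random((100, 784))
y_train = rng.integers(0, 10, size=100)

kfold = KFold(n_splits=10, shuffle=True, random_state=0)
# KFold ignores y, so kfold.split(x_train) alone is enough
for train_index, test_index in kfold.split(x_train):
    x_tr, y_tr = x_train[train_index], y_train[train_index]
    x_val, y_val = x_train[test_index], y_train[test_index]
    # ...wrap each fold in a DataLoader and train the network here
```

With 10 splits, each iteration yields 90% of the rows for training and 10% for validation; a fresh model should be initialized inside the loop so folds do not contaminate each other.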

Why am I getting “Supported target types are: ('binary', 'multiclass'). Got 'continuous' instead.” error?

做~自己de王妃 submitted on 2020-01-23 18:20:07

Question: I am writing this code and keep getting the error Supported target types are: ('binary', 'multiclass'). Got 'continuous' instead. no matter what I try. Do you see the problem in my code?

df = pd.read_csv('drain.csv')
values = df.values
seed = 7
numpy.random.seed(seed)
X = df.iloc[:,:2]
Y = df.iloc[:,2:]
def create_model():
    # create model
    model = Sequential()
    model.add(Dense(12, input_dim=8, activation='relu'))
    model.add(Dense(8, activation='relu'))
    model.add(Dense(1, activation='sigmoid')
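This error comes from stratified splitting, not from the network: StratifiedKFold (and anything that stratifies internally) needs a discrete binary or multiclass target, and raises exactly this ValueError when y contains continuous values. A minimal sketch on assumed random data, not the asker's drain.csv:

```python
import numpy as np
from sklearn.model_selection import KFold, StratifiedKFold

np.random.seed(0)
X = np.random.rand(20, 3)
y_cont = np.random.rand(20)           # continuous target -> stratification fails
y_cls = (y_cont > 0.5).astype(int)    # discrete 0/1 labels -> stratification works

try:
    next(StratifiedKFold(n_splits=2).split(X, y_cont))
except ValueError as e:
    print(e)  # the "Supported target types are: ..." message from the question

# Fix 1: stratify on genuinely discrete class labels
folds = list(StratifiedKFold(n_splits=2).split(X, y_cls))

# Fix 2: if the target really is continuous (regression), use plain KFold
folds_reg = list(KFold(n_splits=2).split(X))
```

So the usual fix is either to make sure Y holds class labels (here the sigmoid output suggests it should be 0/1) or to switch to KFold when the problem is actually regression.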