classification

Keras' fit_generator() for binary classification predictions always 50%

落花浮王杯 submitted on 2021-02-10 17:30:00
Question: I have set up a model to classify whether or not an image is from a certain video game. I pre-scaled my images to 250x250 pixels and separated them into two folders (the two binary classes) labelled 0 and 1. The two classes are within ~100 images of each other, and I have around 3500 images in total. Here are photos of the training process, the model setup and some predictions: https://imgur.com/a/CN1b6LV train_datagen = ImageDataGenerator( rescale=1. / 255, shear_range=0,

Optimal Feature Selection Technique after PCA?

旧城冷巷雨未停 submitted on 2021-02-10 14:51:50
Question: I'm implementing a classification task with a binary outcome using RandomForestClassifier, and I know the importance of data preprocessing for improving the accuracy score. In particular, my dataset contains more than 100 features and almost 4000 instances, and I want to apply a dimensionality reduction technique to avoid overfitting, since there is a high presence of noise in the data. For these tasks I usually use a classical Feature Selection method (filters, wrappers, feature

Calculate entropy for each class of the test set to measure uncertainty in PyTorch

无人久伴 submitted on 2021-02-10 05:11:13
Question: I am trying to calculate the entropy of each class of the dataset for an image classification task, to measure uncertainty in PyTorch, using the MC Dropout method and the solution proposed in this link: "Measuring uncertainty using MC Dropout on pytorch". First, I calculated the mean of each class per batch across different forward passes (class_mean_batch), then for the whole testloader (classes_mean), and then did some transformations to get (total_mean) to use it for calculating entropy as shown
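For orientation, one common way to turn averaged MC Dropout outputs into an uncertainty score is the Shannon entropy of the mean class probabilities. The sketch below is plain Python rather than PyTorch, works per sample rather than per class, and uses made-up probabilities. The names `mean_probs` and `predictive_entropy` are hypothetical and do not come from the question.

```python
import math

def mean_probs(passes):
    """Average class probabilities over several stochastic forward passes.

    `passes` is a list of probability vectors, one per forward pass.
    """
    n_passes = len(passes)
    n_classes = len(passes[0])
    return [sum(p[c] for p in passes) / n_passes for c in range(n_classes)]

def predictive_entropy(probs, eps=1e-12):
    """Shannon entropy (in nats) of one probability vector.

    `eps` guards against log(0) when a class probability is exactly zero.
    """
    return -sum(p * math.log(p + eps) for p in probs)

# Illustrative values only: three dropout passes over one sample, two classes.
passes = [[0.9, 0.1], [0.6, 0.4], [0.3, 0.7]]
avg = mean_probs(passes)
entropy = predictive_entropy(avg)
```

Entropy is near zero when the averaged prediction is confident and approaches log(number of classes) when the passes disagree or the prediction is close to uniform.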

How to apply a scikit-learn classifier to tiles/windows in a large image

社会主义新天地 submitted on 2021-02-08 20:53:36
Question: Given a trained classifier in scikit-learn, e.g. a RandomForestClassifier. The classifier has been trained on samples of size e.g. 25x25. How can I easily apply it to all tiles/windows in a large image (e.g. 640x480)? What I could do is (slow code ahead!) x_train = np.arange(25*25*1000).reshape(25,25,1000) # just some pseudo training data y_train = np.arange(1000) # just some pseudo training labels clf = RandomForestClassifier() clf.train( ... ) #train the classifier img = np.arange(640
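The window enumeration the question describes can be sketched in plain Python as follows. This is only an illustration under stated assumptions: `iter_windows` and `classify_windows` are hypothetical helpers, the image is a list of lists, and `predict` stands in for any callable wrapping a fitted classifier. It is the straightforward loop, not a fast solution.

```python
def iter_windows(height, width, size=25, stride=25):
    """Yield (top, left) corners of every size x size window inside the image."""
    for top in range(0, height - size + 1, stride):
        for left in range(0, width - size + 1, stride):
            yield top, left

def classify_windows(image, predict, size=25, stride=25):
    """Apply `predict` to each window of a 2-D list-of-lists image.

    `predict` is a placeholder for any callable that maps a flat feature
    list to a label, for example a wrapper around a fitted classifier.
    """
    height, width = len(image), len(image[0])
    results = {}
    for top, left in iter_windows(height, width, size, stride):
        # Flatten the window row by row into one feature vector.
        flat = [image[r][c]
                for r in range(top, top + size)
                for c in range(left, left + size)]
        results[(top, left)] = predict(flat)
    return results

# Toy usage: a 480-row by 640-column image of zeros and a dummy predictor.
image = [[0] * 640 for _ in range(480)]
labels = classify_windows(image, predict=lambda feats: int(sum(feats) > 0))
```

With a real scikit-learn model, collecting all flattened windows and calling the classifier's `predict` once on the whole batch is usually faster than one call per window. A smaller `stride` gives overlapping windows.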

Gradient Boosting using gbm in R with distribution = “bernoulli”

折月煮酒 submitted on 2021-02-08 09:26:21
Question: I am using the gbm package in R with the 'bernoulli' option for distribution to build a classifier, and I get unusual results of 'nan' and am unable to predict any classification results. But I do not encounter the same errors when I use 'adaboost'. Below is the sample code; I replicated the same errors with the iris dataset. ## using the iris data for gbm library(caret) library(gbm) data(iris) Data <- iris[1:100,-5] Label <- as.factor(c(rep(0,50), rep(1,50))) # Split the data into

How to balance images in folder by doing augmentation such that number of images in this folder are equal to number of images in other folder?

淺唱寂寞╮ submitted on 2021-02-08 09:23:54
Question: I have 5 folders named class_i; each folder holds the images of class i. The images are in .jpg format. How can I balance the images in each folder through augmentation, so that the number of images in each folder equals the number of images in the folder with the most images? Also, could you please help with plotting a curve that shows the number of images in each folder before and after balancing? Answer 1: Just extended my other answer with algorithm that does exactly what you want in this
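The bookkeeping behind this kind of balancing can be sketched as below: work out how many augmented images each folder needs to reach the largest folder, then pick source images to augment. This is not the linked answer's algorithm. `augmentation_plan` and `pick_sources` are hypothetical helpers, and the counts and file names are invented for illustration.

```python
import random

def augmentation_plan(counts):
    """Return how many augmented images each class needs to match the largest class."""
    target = max(counts.values())
    return {name: target - n for name, n in counts.items()}

def pick_sources(files, n_needed, seed=0):
    """Choose source images (with replacement) to augment for one class."""
    rng = random.Random(seed)
    return [rng.choice(files) for _ in range(n_needed)]

# Hypothetical image counts for five folders named class_i.
counts = {"class_0": 120, "class_1": 80, "class_2": 200,
          "class_3": 150, "class_4": 200}
plan = augmentation_plan(counts)

# Hypothetical file names for class_1; each pick would get one random augmentation.
class_1_files = [f"img_{i}.jpg" for i in range(counts["class_1"])]
sources = pick_sources(class_1_files, plan["class_1"])
```

The `counts` and `plan` dictionaries also give the before and after numbers needed for a bar or line plot of folder sizes.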

How to use pytorch to construct multi-task DNN, e.g., for more than 100 tasks?

瘦欲@ submitted on 2021-02-08 06:25:15
Question: Below is example code that uses PyTorch to construct a DNN for two regression tasks. The forward function returns two outputs (x1, x2). What about a network for many regression/classification tasks, e.g. 100 or 1000 outputs? It is definitely not a good idea to hardcode all the outputs (e.g. x1, x2, ..., x100). Is there a simple method to do that? Thank you. import torch from torch import nn import torch.nn.functional as F class mynet(nn.Module): def __init__(self): super(mynet, self)._

Multiclass ROC curves in R

限于喜欢 submitted on 2021-02-08 04:37:26
Question: I'm new to the concept of ROC curves. I've tried to understand it by reading a few tutorials on the web. I found a really good example here in Python which was helpful. I want to plot a ROC curve for a multiclass classifier that I built (in Python). However, most of the solutions on the web are for 2-class problems and not multiclass. I finally found the "multiclass.roc" function in the pROC package in R, which does multiclass ROC curve plotting. The following is a simple example: library(pROC

ROC curves for multiclass classification in R

穿精又带淫゛_ submitted on 2021-02-07 08:13:20
Question: I have a dataset with 6 classes and I would like to plot a ROC curve for a multiclass classification. The first answer in this thread, given by Achim Zeileis, is a very good one: "ROC curve in R using rpart package?" But it works only for a binomial classification. The error I get is Error in prediction, Number of classes is not equal to 2. Has anyone done this for a multi-class classification? Here is a simple example of what I am trying to do. data <- read.csv("colors.csv") let's
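A common workaround for the two-class restriction is to treat each class as "this class versus the rest" and compute one ROC summary per class. The sketch below is plain Python rather than R, and it computes the area under each one-vs-rest curve rather than plotting it. `binary_auc` and `one_vs_rest_auc` are hypothetical helpers, and the class names and probabilities are invented for illustration.

```python
def binary_auc(scores, positives):
    """AUC as the chance a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, positives) if y]
    neg = [s for s, y in zip(scores, positives) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def one_vs_rest_auc(prob_rows, labels, classes):
    """Compute one AUC per class by treating that class as positive and all others as negative."""
    return {
        c: binary_auc([row[i] for row in prob_rows], [y == c for y in labels])
        for i, c in enumerate(classes)
    }

# Invented example: predicted class probabilities for four samples, three classes.
classes = ["red", "green", "blue"]
prob_rows = [
    [0.7, 0.2, 0.1],
    [0.1, 0.6, 0.3],
    [0.2, 0.3, 0.5],
    [0.5, 0.4, 0.1],
]
labels = ["red", "green", "blue", "blue"]
auc = one_vs_rest_auc(prob_rows, labels, classes)
```

Each class must appear at least once as a positive and once as a negative, otherwise the division in `binary_auc` fails.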