classification

SVM classification with always high precision

限于喜欢 submitted on 2020-01-25 13:27:20

Question: I have a binary classification problem and I'm trying to get a precision-recall curve for my classifier. I use libsvm with an RBF kernel and the probability-estimate option. To get the curve I change the decision threshold from 0 to 1 in steps of 0.1. But on every run I get high precision even as recall decreases with increasing threshold. My false positive rate always seems low compared to true positives. My results are these: Threshold: 0.1 TOTAL TP:393, FP:1, FN: 49 Precision:0.997462, Recall: 0
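Rather than stepping the threshold manually in increments of 0.1, scikit-learn's `precision_recall_curve` sweeps every distinct score as a threshold. A minimal sketch on synthetic data (the dataset, model settings, and random seeds here are stand-ins, not the asker's setup):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import precision_recall_curve

# Synthetic stand-in for the asker's binary data.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# RBF SVM with probability estimates, analogous to libsvm's -b 1 option.
clf = SVC(kernel="rbf", probability=True, random_state=0).fit(X_train, y_train)
scores = clf.predict_proba(X_test)[:, 1]

# precision_recall_curve tries every distinct score as a threshold,
# so no manual 0-to-1 loop in 0.1 steps is needed.
precision, recall, thresholds = precision_recall_curve(y_test, scores)
print(len(thresholds), "thresholds evaluated")
```

This also makes it easy to see whether the high precision is an artifact of a coarse threshold grid or a genuine property of the score distribution.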

Using mahout in java code, not cli

六眼飞鱼酱① submitted on 2020-01-25 02:20:14

Question: I want to be able to build a model using Java; I am able to do so with the CLI as follows: ./mahout trainlogistic --input Candy-Crush.twtr.csv \ --output ./model \ --target hd_click --categories 2 \ --predictors click_frequency country_code ctr device_price_range hd_conversion time_of_day num_clicks phone_type twitter is_weekend app_entertainment app_wallpaper app_widgets arcade books_and_reference brain business cards casual comics communication education entertainment finance game_wallpaper

Leave one out crossvalind in Matlab

十年热恋 submitted on 2020-01-25 01:27:09

Question: I have extracted HOG features for male and female pictures, and now I'm trying to use the leave-one-out method to classify my data. Since the standard way to write it in Matlab is: [Train, Test] = crossvalind('LeaveMOut', N, M); What should I write instead of N and M? Also, should I write the above statement inside or outside a loop? This is my code, where I have a training folder for male (80 images) and female (80 images), and another one for testing (10 random images). for i = 1:10 [Train, Test
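In `crossvalind('LeaveMOut', N, M)`, N is the total number of observations and M is how many are left out per iteration (M = 1 for leave-one-out); since each call draws a new random partition, it belongs inside the loop. For comparison, a hedged Python sketch of the same idea using scikit-learn's deterministic `LeaveOneOut` splitter (the features, labels, and classifier below are illustrative stand-ins, not the asker's HOG pipeline):

```python
import numpy as np
from sklearn.model_selection import LeaveOneOut
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 8))   # stand-in for 20 HOG feature vectors
y = np.array([0, 1] * 10)      # hypothetical labels: 0 = male, 1 = female

correct = 0
# Each split holds out exactly one sample and trains on the rest.
for train_idx, test_idx in LeaveOneOut().split(X):
    clf = KNeighborsClassifier(n_neighbors=3).fit(X[train_idx], y[train_idx])
    correct += int(clf.predict(X[test_idx])[0] == y[test_idx][0])

accuracy = correct / len(X)
print(f"leave-one-out accuracy: {accuracy:.2f}")
```

Unlike `crossvalind`, `LeaveOneOut` enumerates all N splits exactly once, so no random re-draw per loop iteration is involved.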

How to get all confusion matrix terminologies (TPR, FPR, TNR, FNR) for a multi class?

断了今生、忘了曾经 submitted on 2020-01-24 14:55:29

Question: I have code that prints the confusion matrix for a multiclass classification problem. import itertools import numpy as np import matplotlib.pyplot as plt from sklearn import svm, datasets from sklearn.model_selection import train_test_split from sklearn.metrics import confusion_matrix # import some data to play with iris = datasets.load_iris() X = iris.data y = iris.target class_names = iris.target_names # Split the data into a training set and a test set X_train, X_test, y_train, y_test
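The per-class rates follow from a one-vs-rest decomposition of the confusion matrix: for class i, TP is the diagonal entry, FN is the rest of its row, FP is the rest of its column, and TN is everything else. A sketch continuing from the iris setup in the question:

```python
import numpy as np
from sklearn import datasets, svm
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix

iris = datasets.load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, random_state=0)
y_pred = svm.SVC(kernel="linear").fit(X_train, y_train).predict(X_test)

cm = confusion_matrix(y_test, y_pred)

# One-vs-rest decomposition: per class, everything outside its
# row and column of the confusion matrix is a true negative.
TP = np.diag(cm)
FN = cm.sum(axis=1) - TP
FP = cm.sum(axis=0) - TP
TN = cm.sum() - (TP + FN + FP)

TPR = TP / (TP + FN)   # sensitivity / recall
TNR = TN / (TN + FP)   # specificity
FPR = FP / (FP + TN)
FNR = FN / (FN + TP)
print(TPR, FPR, TNR, FNR)
```

Each of the four arrays has one entry per class, so the terminology generalizes cleanly to any number of classes.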

new shape and old shape must have the same number of elements

女生的网名这么多〃 submitted on 2020-01-24 03:47:25

Question: For learning purposes, I am using TensorFlow.js, and I get an error while trying to use the fit method with a batched dataset (10 by 10) to learn the process of batch training. I have a few 600x600x3 images that I want to classify (2 outputs, either 1 or 0). Here is my training loop: const batches = await loadDataset() for (let i = 0; i < batches.length; i++) { const batch = batches[i] const xs = batch.xs.reshape([batch.size, 600, 600, 3]) const ys = tf.oneHot(batch.ys, 2) console
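The error means the tensor being reshaped does not contain exactly batch.size × 600 × 600 × 3 elements; a last, short batch is the usual culprit. The rule is the same in any tensor library, shown here as a NumPy sketch (the sizes are illustrative):

```python
import numpy as np

batch_size, h, w, c = 10, 600, 600, 3
flat = np.zeros(batch_size * h * w * c, dtype=np.float32)

# Reshape succeeds only when the element counts match exactly.
xs = flat.reshape(batch_size, h, w, c)
print(xs.shape)

# A short final batch has fewer elements, so the same reshape fails
# with the "same number of elements" complaint.
short = np.zeros((batch_size - 3) * h * w * c, dtype=np.float32)
try:
    short.reshape(batch_size, h, w, c)
    msg = ""
except ValueError as e:
    msg = str(e)
print("reshape failed:", msg)
```

In the TF.js loop above, using the actual `batch.size` of each batch (instead of a fixed 10) when reshaping avoids the mismatch on the final partial batch.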

How to correct unstable loss and accuracy during training? (binary classification)

对着背影说爱祢 submitted on 2020-01-23 17:10:30

Question: I am currently working on a small binary classification project using the new Keras API in TensorFlow. The problem is a simplified version of the Higgs Boson challenge posted on Kaggle.com a few years back. The dataset shape is 2000x14, where the first 13 elements of each row form the input vector, and the 14th element is the corresponding label. Here is a sample of said dataset: 86.043,52.881,61.231,95.475,0.273,77.169,-0.015,1.856,32.636,202.068, 2.432,-0.419,0.0,0 138.149,69.197,58.607,129
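One common cause of unstable loss on data like the sample above is the wildly different feature scales (202.068 sitting next to -0.419), which makes gradient steps oscillate. Standardizing each input column is a usual first fix; a minimal NumPy sketch on synthetic stand-in data (not the asker's dataset):

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-in: 2000 rows, 13 features with very different scales.
X = rng.normal(size=(2000, 13)) * rng.uniform(0.1, 200.0, size=13)
y = rng.integers(0, 2, size=2000)

# Standardize per column to zero mean and unit variance, using
# statistics computed on the training data only in a real pipeline.
mean = X.mean(axis=0)
std = X.std(axis=0)
X_scaled = (X - mean) / std

print(X_scaled.mean(axis=0).round(6))
print(X_scaled.std(axis=0).round(6))
```

Lowering the learning rate and shuffling between epochs are the other usual suspects worth checking alongside scaling.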

Why does binary accuracy give high accuracy while categorical accuracy give low accuracy, in a multi-class classification problem?

坚强是说给别人听的谎言 submitted on 2020-01-23 01:35:27

Question: I'm working on a multiclass classification problem using Keras, and I'm using binary accuracy and categorical accuracy as metrics. When I evaluate my model I get a really high value for the binary accuracy and quite a low one for the categorical accuracy. I tried to recreate the binary accuracy metric in my own code but I am not having much luck. My understanding is that this is the process I need to recreate: def binary_accuracy(y_true, y_pred): return K.mean(K.equal(y_true, K.round(y_pred
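With one-hot labels, `binary_accuracy` compares every element of the one-hot matrix, so the many correctly predicted zeros inflate the score, while `categorical_accuracy` compares only the argmax per sample. A NumPy recreation of both definitions on a small hand-built example:

```python
import numpy as np

# One-hot ground truth for 4 samples, 3 classes; only 1 of 4 predictions
# picks the right class.
y_true = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 0, 0]], dtype=float)
y_pred = np.array([[0.9, 0.05, 0.05],   # correct
                   [0.8, 0.1,  0.1],    # wrong
                   [0.7, 0.2,  0.1],    # wrong
                   [0.1, 0.8,  0.1]],   # wrong
                  dtype=float)

# binary_accuracy: element-wise match after rounding, averaged over ALL
# entries -- the many correct zeros push this up.
binary_acc = np.mean(y_true == np.round(y_pred))

# categorical_accuracy: one argmax comparison per sample.
categorical_acc = np.mean(np.argmax(y_true, axis=1) == np.argmax(y_pred, axis=1))

print(binary_acc, categorical_acc)   # 0.5 vs 0.25
```

The gap widens as the number of classes grows, since each wrong prediction still matches most of its zero entries; this is why binary accuracy is misleading for multiclass problems.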

Gamma distribution fit error

谁说我不能喝 submitted on 2020-01-22 02:29:25

Question: For a classification task I want to fit a gamma distribution to two pairs of data: the distance populations within class and between class. This is to determine the theoretical False Accept and False Reject Rates. The fit SciPy returns puzzles me, though. A plot of the data is below, where circles denote within-class distances and crosses between-class distances; the solid line is the fitted gamma on the within-class distances, the dotted line is the fitted gamma on the between-class distances. What I would have expected
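One frequent source of surprising gamma fits in SciPy is that `gamma.fit` also estimates a location parameter by default, shifting the distribution and distorting shape and scale. Pinning `floc=0` recovers the textbook two-parameter gamma; a sketch on synthetic distances (parameters chosen for illustration, not taken from the question):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Synthetic stand-in for within-class distances: gamma(shape=2, scale=3).
data = rng.gamma(shape=2.0, scale=3.0, size=5000)

# Default fit lets the location float, which can soak up probability
# mass and skew the shape/scale estimates.
a_free, loc_free, scale_free = stats.gamma.fit(data)

# Fixing loc=0 matches the usual two-parameter gamma model.
a0, loc0, scale0 = stats.gamma.fit(data, floc=0)
print(a0, loc0, scale0)
```

If the two fitted curves in the plot look shifted off the data, comparing the free-`loc` and `floc=0` fits is a quick diagnostic.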

GBM R function: get variable importance separately for each class

白昼怎懂夜的黑 submitted on 2020-01-20 16:48:06

Question: I am using the gbm function in R (gbm package) to fit stochastic gradient boosting models for multiclass classification. I am simply trying to obtain the importance of each predictor separately for each class, as in the picture from the Hastie book (The Elements of Statistical Learning, p. 382). However, the function summary.gbm only returns the overall importance of the predictors (their importance averaged over all classes). Does anyone know how to get the relative importance values?

How to classify true negative from a video?

旧城冷巷雨未停 submitted on 2020-01-17 18:03:04

Question: For performance-measuring purposes I am trying to draw a ROC curve. In a ROC curve I have to plot the False Positive Rate (FPR) on the x-axis and the True Positive Rate (TPR) on the y-axis. As we know, FPR = FP/(FP+TN). So in the following picture, how can I detect True Negatives (TN)? I have used a HOG classifier to detect humans. I marked rectangles 1, 2, 3, 4, 5, 6 (or should it be 7?) to show the objects that should be ignored and not classified as human, and I think those are True Negatives. In this picture I
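For a sliding-window detector, "true negatives" are conventionally the scanned windows (or regions) that contain no person and were not flagged, which is why TN is huge and FPR tiny. Once the four counts are fixed, the ROC coordinates are plain arithmetic; a sketch with hypothetical per-frame counts (not taken from the question):

```python
# Hypothetical counts for one detector threshold; TN counts the
# non-person windows the detector correctly left unflagged.
TP, FP, FN, TN = 42, 3, 5, 950

TPR = TP / (TP + FN)   # y-axis of the ROC curve (recall)
FPR = FP / (FP + TN)   # x-axis of the ROC curve

print(round(TPR, 3), round(FPR, 3))
```

Because TN depends on how many windows are scanned, many detection papers sidestep the ambiguity by plotting miss rate against false positives per image (or per window) instead of a classic ROC.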