Classification

Predict probabilities using SVM

Submitted by 眉间皱痕 on 2020-12-29 07:14:29
Question: I wrote this code and wanted to obtain the classification probabilities:

    from sklearn import svm

    X = [[0, 0], [10, 10], [20, 30], [30, 30], [40, 30], [80, 60], [80, 50]]
    y = [0, 1, 2, 3, 4, 5, 6]
    clf = svm.SVC()
    clf.probability = True
    clf.fit(X, y)
    prob = clf.predict_proba([[10, 10]])
    print(prob)

I obtained this output:

    [[0.15376986 0.07691205 0.15388546 0.15389275 0.15386348 0.15383004 0.15384636]]

which is very strange, because the probabilities should have been [0 1 0 0 0 0 0] (observe that the sample being predicted is identical to the second training sample, whose label is 1).
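
For context on the odd output: with probability=True, SVC calibrates probabilities by Platt scaling fitted with internal cross-validation, and with only one training example per class there is nothing meaningful to calibrate, so predict_proba can even contradict predict. A minimal sketch of what is reliable on such a tiny set (the hard prediction and the raw decision scores):

    from sklearn import svm

    X = [[0, 0], [10, 10], [20, 30], [30, 30], [40, 30], [80, 60], [80, 50]]
    y = [0, 1, 2, 3, 4, 5, 6]

    # Passing probability=True to the constructor is the idiomatic way to
    # enable the (cross-validated) probability calibration.
    clf = svm.SVC(probability=True)
    clf.fit(X, y)

    print(clf.predict([[10, 10]]))            # hard label; [1] is expected here
    print(clf.decision_function([[10, 10]]))  # uncalibrated one-vs-rest scores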

Adding gaussian noise to a dataset of floating points and save it (python)

Submitted by 百般思念 on 2020-12-29 03:02:39
Question: I'm working on a classification problem where I need to add different levels of Gaussian noise to my dataset and run classification experiments until my ML algorithms can no longer classify the dataset. Unfortunately, I have no idea how to do that. Any advice or coding tips on how to add the Gaussian noise?

Answer 1: You can follow these steps (see the sketch below):

1. Load the data into a pandas DataFrame: clean_signal = pd.read_csv("data_file_name")
2. Use numpy to generate Gaussian noise with the same dimensions as the dataset.
3. Add the noise to the clean data.
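
A minimal sketch of these steps, assuming the dataset is a CSV of floating-point features; "data_file_name.csv" and "noisy_data.csv" are placeholder file names, and raising sigma increases the noise level:

    import numpy as np
    import pandas as pd

    clean_signal = pd.read_csv("data_file_name.csv")  # placeholder file name

    mu, sigma = 0, 0.1  # increase sigma for higher noise levels
    noise = np.random.normal(mu, sigma, clean_signal.shape)
    noisy_signal = clean_signal + noise  # element-wise addition, same shape

    noisy_signal.to_csv("noisy_data.csv", index=False)  # save the noisy dataset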

How can I generate a plot of positive predictive value (PPV) vs various cut-off points for classifications?

Submitted by 两盒软妹~` on 2020-12-25 01:29:08
Question: I have generated some scores to help predict whether something is a yes (1) or a no (0). Let's say the data consists of:

    scores = c(10:20)
    response = c(0,0,1,0,1,0,1,1,0,1,1)
    mydata = data.frame(scores, response)

I can do an ROC analysis, which gives an AUC of 0.77:

    roc(response = mydata$response, predictor = mydata$scores)

Now, how exactly do I see what happens when various cut-offs are chosen? I'd like to have the cut-offs on the x-axis (say 13, 14, 15, 16, 17) and PPV on the y-axis. What's the best way to do this?
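
The snippet above is R (pROC's roc), but the computation itself is just PPV = TP / (TP + FP) at each cut-off, counting everything at or above the cut-off as a predicted "yes". A Python sketch of the same idea, in the language used elsewhere on this page:

    import numpy as np
    import matplotlib.pyplot as plt

    scores = np.arange(10, 21)  # 10..20, as in the R example
    response = np.array([0, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1])

    cutoffs = [13, 14, 15, 16, 17]
    ppv = []
    for c in cutoffs:
        pred = scores >= c  # classify "yes" at or above the cut-off
        tp = np.sum(pred & (response == 1))
        fp = np.sum(pred & (response == 0))
        # Guard against cut-offs with no predicted positives
        ppv.append(tp / (tp + fp) if (tp + fp) > 0 else np.nan)

    plt.plot(cutoffs, ppv, marker="o")
    plt.xlabel("cut-off")
    plt.ylabel("PPV")
    plt.show()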

Validation accuracy metrics reported by the Keras model.fit log and sklearn.metrics.confusion_matrix don't match

Submitted by 别来无恙 on 2020-12-15 06:29:10
Question: The problem is that the validation accuracy reported in the Keras model.fit history is significantly higher than the validation accuracy I get from sklearn.metrics functions. The results from model.fit are summarized below:

    Last validation accuracy: 0.81
    Best validation accuracy: 0.84

The (normalized) results from sklearn are quite different:

    True negatives: 0.78
    True positives: 0.77
    Validation accuracy = (TP + TN) / (TP + TN + FP + FN) = 0.775

(see the confusion matrix).
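
One frequent cause of such a gap is computing the sklearn metrics on data that differs from what Keras evaluated, e.g. a shuffled validation generator that misaligns predictions and labels. A like-for-like check, assuming the hypothetical names model, x_val and y_val for the trained model and the exact arrays passed as validation_data:

    from sklearn.metrics import confusion_matrix

    # model, x_val, y_val are assumed to exist: the trained Keras model and
    # the exact validation arrays given to model.fit (no shuffling, no
    # re-split, identical preprocessing).
    probs = model.predict(x_val).ravel()
    y_pred = (probs > 0.5).astype(int)  # the same 0.5 cut-off Keras' binary accuracy uses

    tn, fp, fn, tp = confusion_matrix(y_val, y_pred).ravel()
    print((tp + tn) / (tp + tn + fp + fn))  # should track the reported val accuracy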

Getting the maximum accuracy for a binary probabilistic classifier in scikit-learn

Submitted by ≡放荡痞女 on 2020-12-01 11:17:05
Question: Is there any built-in function to get the maximum accuracy for a binary probabilistic classifier in scikit-learn? E.g., to get the maximum F1-score I do:

    import sklearn.metrics

    # AUCPR
    precision, recall, thresholds = sklearn.metrics.precision_recall_curve(y_true, y_score)
    auprc = sklearn.metrics.auc(recall, precision)

    max_f1 = 0
    for r, p, t in zip(recall, precision, thresholds):
        if p + r == 0:
            continue
        if (2 * p * r) / (p + r) > max_f1:
            max_f1 = (2 * p * r) / (p + r)
            max_f1_threshold = t

I could compute the maximum accuracy in a similar brute-force way, but is there a built-in for it?
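
To my knowledge scikit-learn has no built-in "maximum accuracy" helper, but roc_curve provides the per-threshold counts needed to compute it in a vectorized way. A sketch assuming the same y_true and y_score as in the question:

    import numpy as np
    from sklearn.metrics import roc_curve

    # roc_curve sweeps every distinct score as a threshold, so accuracy can
    # be derived from it without an explicit loop.
    fpr, tpr, thresholds = roc_curve(y_true, y_score)
    n_pos = np.sum(np.asarray(y_true) == 1)
    n_neg = np.sum(np.asarray(y_true) == 0)

    # accuracy = (TP + TN) / N, with TP = tpr * P and TN = (1 - fpr) * N
    accuracy = (tpr * n_pos + (1 - fpr) * n_neg) / (n_pos + n_neg)
    best = np.argmax(accuracy)
    print(accuracy[best], thresholds[best])  # max accuracy and its threshold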