classification

The pooled covariance matrix of TRAINING must be positive definite

主宰稳场 posted on 2020-01-03 06:01:25
Question: I know this question has already been asked a couple of times, but I couldn't find a solution to my problem. I don't have more variables than observations, and I don't have NaN values in my matrix. Here's my function:
function [ind, idx_ran] = fselect(features_f, class_f, dir)
idx = linspace(1,size(features_f, 2), size(features_f, 2));
idx_ran = idx(:,randperm(size(features_f, 2)));
features_t_ran = features_f(:,idx_ran); % randomize columns
len = length(class_f);
r = randi(len, [1, round(len*0

How to rank the instances based on prediction probability in sklearn

时光怂恿深爱的人放手 posted on 2020-01-03 01:51:08
Question: I am using sklearn's support vector machine (SVC) with 10-fold cross-validation to get the prediction probability of each instance in my dataset:
from sklearn import datasets
iris = datasets.load_iris()
X = iris.data
y = iris.target
clf=SVC(class_weight="balanced")
proba = cross_val_predict(clf, X, y, cv=10, method='predict_proba')
print(clf.classes_)
print(proba[:,1])
print(np.argsort(proba[:,1]))
My expected output is as follows for print(proba[:,1]) and print(np
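A minimal sketch of one way to produce such a ranking, not the asker's final code. It assumes scikit-learn and NumPy are installed, adds the imports the snippet above omits, and sets `probability=True`, which `SVC` needs before it exposes `predict_proba`. Because `cross_val_predict` fits clones of the estimator, `clf.classes_` is not available afterwards, so the class order is taken from `np.unique(y)` instead. The names `classes`, `ranking`, and `top5` are illustrative.

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import cross_val_predict
from sklearn.svm import SVC

iris = datasets.load_iris()
X, y = iris.data, iris.target

# probability=True is required for SVC to provide predict_proba
clf = SVC(class_weight="balanced", probability=True, random_state=0)

# Out-of-fold probability estimates: one row per instance, one column per class
proba = cross_val_predict(clf, X, y, cv=10, method="predict_proba")

# clf itself is never fitted here; columns follow the sorted unique labels
classes = np.unique(y)

# Instance indices ordered from highest to lowest probability of class 1
ranking = np.argsort(-proba[:, 1])
top5 = ranking[:5]

print(classes)
print(top5, proba[top5, 1])
```

Negating the column before `np.argsort` gives a descending order; dropping the minus sign ranks from least to most probable instead.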

Scikit learn Error Message 'Precision and F-score are ill-defined and being set to 0.0 in labels'

时光总嘲笑我的痴心妄想 posted on 2020-01-02 02:03:09
Question: I'm working on a binary classification model; the classifier is naive Bayes. I have an almost balanced dataset, however I get the following error message when I predict: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples. 'precision', 'predicted', average, warn_for) I'm using grid search with 10-fold cross-validation. The test set and predictions contain both classes, so I don't understand the message. I'm working on the same dataset, train
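A small sketch of the situation the warning describes, using made-up labels rather than the asker's data. It assumes scikit-learn 0.22 or later, where `precision_score` accepts `zero_division`. When a label never appears among the predictions (which can happen inside a single cross-validation fold even if the overall predictions contain both classes), its precision is 0/0 and scikit-learn substitutes a value.

```python
from sklearn.metrics import precision_score

# Hypothetical fold: the classifier predicted only class 0
y_true = [0, 1, 0, 1]
y_pred = [0, 0, 0, 0]

# Labels that occur in the ground truth but were never predicted
missing = set(y_true) - set(y_pred)

# zero_division=0 makes the substituted value explicit and silences the warning
precision = precision_score(y_true, y_pred, zero_division=0)

print(missing, precision)
```

Checking `missing` per fold is one way to find out which split triggers the message.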

Java: How can I assemble/create a single instance for classification using a Weka generated model?

故事扮演 posted on 2020-01-01 19:20:15
Question: I've been searching for an answer to this for a while to no avail. First a bit of background: I'm trying to create an AI for Robocode using Weka. I'm first logging the required data from a manual robot to an ARFF file; this is working as it should. This data is then processed using Weka and a model is created, which I then save to a file. I can successfully import the model, classify a dataset that has been imported from another ARFF file, and use the results. What I want to do now is every

How to classify continuous audio

天涯浪子 posted on 2020-01-01 10:58:07
Question: I have an audio dataset in which each recording has a different length. There are some events in these recordings that I want to train and test on, but the events are placed randomly and their lengths differ, so it is really hard to build a machine learning system with this dataset. I thought of fixing a default length and building a multilayer NN; however, the lengths of the events are also different. Then I thought about using a CNN, like it is used to recognise patterns or multiple humans on an

How do I classify a word of a text into things like names, numbers, money, dates, etc.?

人走茶凉 posted on 2020-01-01 07:30:52
Question: I asked some questions about text mining a week ago, but I was a bit confused, and still am, but now I know what I want to do. The situation: I have a lot of downloaded pages with HTML content. Some of them can be text from a blog, for example. They are not structured and come from different sites. What I want to do: I will split all the words on whitespace, and I want to classify each one, or a group of them, into some pre-defined items like names, numbers, phone, email, url, date, money,

Vehicle segmentation and tracking

匆匆过客 posted on 2020-01-01 06:08:48
Question: I've been working on a project for some time to detect and track (moving) vehicles in video captured from UAVs. Currently I am using an SVM trained on bag-of-features representations of local features extracted from vehicle and background images. I am then using a sliding-window detection approach to try to localise vehicles in the images, which I would then like to track. The problem is that this approach is far too slow and my detector isn't as reliable as I would like, so I'm getting quite

What is the equivalent for a Hidden Markov Model in the WEKA toolkit?

纵饮孤独 posted on 2020-01-01 05:48:08
Question: I need to classify a data stream which comes from a sensor network consisting of 8 accelerometers. Each accelerometer gives me an X, Y, and Z value. Thus at each sample I have 8 x 3 = 24 acceleration values. I sample at about 30 Hz and the performance time is about 0.5 seconds. At first I thought of using a Hidden Markov Model for this, but it seems that the WEKA toolkit does not provide such a thing. What is the WEKA equivalent for this? Thank you. EDIT: how to format data? I have collected data

How can I calculate the point between two overlapping linear datasets?

穿精又带淫゛_ posted on 2020-01-01 05:12:08
Question: I have two sets of data that overlap a bit (see plot below). I need to find the point between these sets where one would guess an unknown data point would belong in a particular category. If I have a new data point (let's say 5000), and had to bet $$$ on whether it belongs in Group A or Group B, how can I calculate the point that makes my bet most sure? See sample dataset and accompanying plot below with approximated point between these groups (calculated by eye). GROUP A [385,515,975,1136
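One simple way to pick such a point, sketched here under loud assumptions: the two lists below are made-up values (the question's data is cut off above and is not reproduced), Group A is assumed to lie mostly below Group B, and `best_threshold` is a hypothetical helper name. The sketch tries every midpoint between neighbouring values and keeps the one that misclassifies the fewest known points. It is an empirical cut, not a fitted statistical model.

```python
def best_threshold(group_a, group_b):
    """Return the cut point with the fewest misclassified known values.

    Assumes values at or below the cut are labelled A and values above it B.
    """
    values = sorted(set(group_a) | set(group_b))
    # Candidate cuts: midpoints between neighbouring distinct values
    candidates = [(lo + hi) / 2 for lo, hi in zip(values, values[1:])]

    def errors(cut):
        # A values above the cut plus B values at or below it
        return sum(a > cut for a in group_a) + sum(b <= cut for b in group_b)

    return min(candidates, key=errors)


# Made-up data for illustration only
group_a = [400, 900, 1500, 2600, 4100, 4800, 5200]
group_b = [4600, 5800, 7000, 8200, 9500]

threshold = best_threshold(group_a, group_b)
new_point = 5000
label = "A" if new_point <= threshold else "B"
print(threshold, label)
```

If several cuts tie, `min` returns the first one found; fitting a distribution to each group and comparing densities would be an alternative when the groups are large enough.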

Weka: How to get the probabilities of each class for the test instances

六月ゝ 毕业季﹏ posted on 2020-01-01 03:43:07
Question: In the case of Weka's Explorer, is there any way to get the class probabilities of the test instances as classified by a Naive Bayes classifier?
Answer 1: In the Weka Explorer, on the Classify tab, click More options... and tick Output predictions. Then click Start to run the training and testing; the result shows the probability assigned to each class for each test instance.
Source: https://stackoverflow.com/questions/10868233/weka-how-to-get-the-probabilities-of-each-class-for-the-test-instances