Classification

Combining Weak Learners into a Strong Classifier

狂风中的少年 submitted on 2019-12-09 12:13:28
Question: How do I combine a few weak learners into a strong classifier? I know the formula, but the problem is that every paper about AdaBoost that I've read gives only formulas without any example. I mean, I have the weak learners and their weights, so I can do what the formula tells me to do (multiply each learner's output by its weight, add the next one multiplied by its weight, and so on), but how exactly do I do that? My weak learners are decision stumps. Each has an attribute and a threshold, so what do…
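One way to make the formula H(x) = sign(sum_t alpha_t * h_t(x)) concrete is a minimal Python sketch of the weighted vote. The stumps, weights, and test point below are made-up illustrations, not values from the question:

# Hypothetical decision stumps as (attribute index, threshold, polarity).
# Each predicts +1 if x[attr] > threshold, else -1, flipped when polarity is -1.
stumps = [(0, 2.5, 1), (1, -0.3, -1), (0, 7.1, 1)]
alphas = [0.42, 0.65, 0.38]  # the weights AdaBoost assigned during training

def stump_predict(x, attr, threshold, polarity):
    return polarity if x[attr] > threshold else -polarity

def strong_classify(x):
    # The formula in action: multiply each stump's vote by its weight,
    # sum the weighted votes, and take the sign of the total.
    score = sum(a * stump_predict(x, *s) for a, s in zip(alphas, stumps))
    return 1 if score >= 0 else -1

print(strong_classify([3.0, 0.1]))  # -> -1 for these toy stumps and weights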

How best to deal with “None of the above” in Image Classification?

南笙酒味 submitted on 2019-12-09 10:58:52
Question: This seems to be a fundamental question on which some of you out there must have an opinion. I have an image classifier implemented in CNTK with 48 classes. If an image does not match any of the 48 classes well, I'd like to be able to conclude that it is not among these 48 image types. My original idea was simply that if the highest output of the final Softmax layer was low, I would be able to conclude that the test image matched none of them well. While I occasionally see this occur, in…
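For what it's worth, the thresholding idea from the question can be written down in a few lines. A minimal sketch in Python, where the 0.5 cutoff is an arbitrary placeholder that would need tuning on held-out data:

import numpy as np

def classify_with_reject(probs, threshold=0.5):
    # probs: the softmax output over the 48 classes; below the
    # confidence cutoff we answer "none of the above" (None).
    best = int(np.argmax(probs))
    return best if probs[best] >= threshold else None

probs = np.full(48, 1.0 / 48)       # a maximally unsure prediction
print(classify_with_reject(probs))  # -> None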

Modeling a very big data set (1.8 Million rows x 270 Columns) in R

。_饼干妹妹 submitted on 2019-12-09 06:54:05
Question: I am working on Windows 8 with 8 GB of RAM. I have a data.frame of 1.8 million rows x 270 columns on which I have to perform a GLM (logit or any other classification). I've tried the ff and bigglm packages for handling the data, but I still hit the error "Error: cannot allocate vector of size 81.5 Gb". So I decreased the number of rows to 10 and tried the bigglm steps on an object of class ffdf, but the error persists. Can anyone suggest…
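The question is about R, but the underlying fix is language-independent: fit the model incrementally on chunks so the full matrix never has to sit in memory at once. Purely as an illustration of that pattern (not an R answer), here is a sketch with scikit-learn's SGDClassifier, whose partial_fit trains a logistic model chunk by chunk; the chunk size is a placeholder, the random data keeps it self-contained, and the loss name is spelled 'log' in older scikit-learn versions:

import numpy as np
from sklearn.linear_model import SGDClassifier

clf = SGDClassifier(loss='log_loss')  # logistic loss, i.e. a logit model
classes = np.array([0, 1])

for start in range(0, 1_800_000, 100_000):
    # In practice each chunk would be streamed from disk (e.g. a CSV
    # reader with a chunksize); random data stands in for that here.
    X = np.random.rand(100_000, 270)
    y = np.random.randint(0, 2, size=100_000)
    clf.partial_fit(X, y, classes=classes)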

Why does the Gaussian radial basis function map the examples into an infinite-dimensional space?

瘦欲@ submitted on 2019-12-09 05:04:22
Question: I've just read through the Wikipedia page about SVMs, and this line caught my eye: "If the kernel used is a Gaussian radial basis function, the corresponding feature space is a Hilbert space of infinite dimensions." http://en.wikipedia.org/wiki/Support_vector_machine#Nonlinear_classification In my understanding, if I apply a Gaussian kernel in an SVM, the resulting feature space will be m-dimensional (where m is the number of training samples), since you choose your landmarks to be your training…
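Both pictures are consistent: the m-dimensional view comes from the kernel trick, which only ever evaluates the kernel at the m training points, while the Wikipedia claim is about the implicit feature map itself. Expanding the Gaussian kernel (written here in LaTeX, for two vectors x, y and width sigma) shows why that map is infinite-dimensional:

k(x, y) = \exp\left(-\frac{\|x - y\|^2}{2\sigma^2}\right)
        = \exp\left(-\frac{\|x\|^2}{2\sigma^2}\right) \exp\left(-\frac{\|y\|^2}{2\sigma^2}\right) \sum_{n=0}^{\infty} \frac{(x^\top y)^n}{\sigma^{2n}\, n!}

Each term of the series is (up to scaling) a polynomial kernel of degree n, with a finite-dimensional feature map; the infinite sum stacks all of them, so the combined feature map has infinitely many coordinates.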

OpenCV and Latent SVM Detector

醉酒当歌 submitted on 2019-12-09 00:08:00
Question: I was wondering if anyone has managed to use the OpenCV implementation of the Latent SVM Detector (http://docs.opencv.org/modules/objdetect/doc/latent_svm.html) successfully. There is sample code that shows how to use the library, but the problem is that it relies on a ready-made detector model that was generated with MATLAB. Can someone guide me through the steps of generating my own detector model? Answer 1: The MATLAB implementation of LatSVM by the authors of the paper has a…

Build a custom svm kernel matrix with opencv

痴心易碎 submitted on 2019-12-08 23:45:26
Question: I have to train a Support Vector Machine model and I'd like to use a custom kernel matrix instead of the preset ones (RBF, polynomial, etc.). Is that possible with OpenCV's machine learning library, and if so, how? Thank you! Answer 1: AFAICT, custom kernels for SVM aren't supported directly in OpenCV. It looks like LIBSVM, the underlying library that OpenCV uses for this, doesn't provide a particularly easy means of defining custom kernels, so many of the wrappers that use LIBSVM…
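Since OpenCV offers no hook for this, one common workaround outside OpenCV (sketched here with scikit-learn, which also wraps LIBSVM) is to precompute the kernel matrix yourself and pass kernel='precomputed'. The linear Gram matrix and random data below are stand-ins for whatever custom kernel and data you actually have:

import numpy as np
from sklearn.svm import SVC

def my_kernel(A, B):
    # Replace with any custom positive semi-definite similarity.
    return A @ B.T

X_train = np.random.rand(20, 5)
y_train = np.random.randint(0, 2, size=20)
X_test = np.random.rand(3, 5)

svc = SVC(kernel='precomputed')
svc.fit(my_kernel(X_train, X_train), y_train)   # n_train x n_train Gram matrix
print(svc.predict(my_kernel(X_test, X_train)))  # rows: test points vs. training set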

How to implement TensorFlow's next_batch for your own data

不羁的心 submitted on 2019-12-08 22:53:36
Question: In the TensorFlow MNIST tutorial the mnist.train.next_batch(100) function comes in very handy. I am now trying to implement a simple classification myself. I have my training data in a numpy array. How could I implement a similar function for my own data to give me the next batch?

sess = tf.InteractiveSession()
tf.global_variables_initializer().run()
Xtr, Ytr = loadData()
for it in range(1000):
    batch_x = Xtr.next_batch(100)
    batch_y = Ytr.next_batch(100)

Answer 1: The link you posted says: "we get a…
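A minimal sketch of such a helper (the array shapes below are placeholders): a generator that reshuffles the indices once per epoch and hands out aligned (x, y) slices, so features and labels stay paired, unlike the code above, which draws them independently:

import numpy as np

def next_batch(X, y, batch_size):
    # Yields (X, y) mini-batches forever, reshuffling after each full
    # pass; both arrays are sliced with the same indices.
    n = X.shape[0]
    while True:
        order = np.random.permutation(n)
        for start in range(0, n, batch_size):
            sel = order[start:start + batch_size]
            yield X[sel], y[sel]

Xtr = np.random.rand(1000, 784)           # placeholder training data
Ytr = np.random.randint(0, 10, size=1000)
batches = next_batch(Xtr, Ytr, 100)
for it in range(1000):
    batch_x, batch_y = next(batches)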

One Class Classification Models in Spark

妖精的绣舞 submitted on 2019-12-08 20:45:42
Are there any implementations of one-class classifiers in Spark? There doesn't appear to be anything in ML or MLlib, but I was hoping someone in the community had developed an extension that provides a way of producing a trained classification model when only one labelled class is available in the training data. It's Java, not Spark, but LIBSVM has a one-class SVM classifier, and calling it from Spark shouldn't be a problem. Source: https://stackoverflow.com/questions/42254637/one-class-classification-models-in-spark
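For reference, the LIBSVM one-class SVM mentioned above is also exposed through scikit-learn; a minimal non-Spark sketch (random data as a placeholder), where nu bounds the fraction of training points treated as outliers:

import numpy as np
from sklearn.svm import OneClassSVM

X_pos = np.random.randn(200, 4)  # only the single labelled class
ocsvm = OneClassSVM(kernel='rbf', nu=0.05, gamma='scale').fit(X_pos)

X_new = np.vstack([np.random.randn(3, 4),         # should resemble the class
                   np.random.randn(2, 4) + 6.0])  # shifted, should be rejected
print(ocsvm.predict(X_new))  # +1 = looks like the training class, -1 = novelty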

Multivariate binary sequence prediction with CRF

浪子不回头ぞ submitted on 2019-12-08 20:43:27
Question: This question is an extension of this one, which focuses on LSTMs as opposed to CRFs. Unfortunately, I do not have any experience with CRFs, which is why I'm asking these questions. Problem: I would like to predict a sequence of binary signals for multiple, non-independent groups. My dataset is moderately small (~1000 records per group), so I would like to try a CRF model here. Available data: I have a dataset with the following variables: timestamps, group, and a binary signal representing activity…
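To give an idea of what a linear-chain CRF looks like in practice, here is a minimal sketch assuming the sklearn-crfsuite package (not mentioned in the question): each group becomes one sequence, each timestep a dict of features, and the binary signal the string label crfsuite expects. The feature names and values are made-up placeholders:

import sklearn_crfsuite

# Two toy sequences (one per group); real features would come from
# the timestamps and any other per-timestep covariates.
X_train = [
    [{'hour': 9, 'prev_active': 0}, {'hour': 10, 'prev_active': 1}],
    [{'hour': 22, 'prev_active': 0}, {'hour': 23, 'prev_active': 0}],
]
y_train = [['1', '1'], ['0', '0']]

crf = sklearn_crfsuite.CRF(algorithm='lbfgs', c1=0.1, c2=0.1,
                           max_iterations=100, all_possible_transitions=True)
crf.fit(X_train, y_train)
print(crf.predict(X_train))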

SKLearn: Getting distance of each point from decision boundary?

本小妞迷上赌 submitted on 2019-12-08 14:24:31
I am using scikit-learn to run SVC on my data:

from sklearn import svm
svc = svm.SVC(kernel='linear', C=C).fit(X, y)

How can I get the distance of each data point in X from the decision boundary?

For a linear kernel, the decision boundary is w * x + b = 0; the signed distance from a point x to it is f(x) / ||w||, where f(x) = w * x + b is the value of the decision function:

import numpy as np
f = svc.decision_function(X)
w_norm = np.linalg.norm(svc.coef_)
dist = f / w_norm

For non-linear kernels there is no way to get the absolute distance, but you can still use the result of decision_function as a relative distance. It happens to be that I am doing the homework…