
Support Vector Machine (SVM)

Submitted by 浪尽此生 on 2019-12-27 13:34:11
The Support Vector Machine (SVM), as a trainable machine learning method, can perform guide star extraction using model parameters learned from a small sample, yielding a guide star catalog that is uniformly distributed and greatly reduced in star count.

Basic background

Building on many years of research into statistical learning theory, Vapnik et al. proposed an alternative optimality criterion for designing linear classifiers. The principle starts from the linearly separable case, then extends to the linearly non-separable case, and even to the use of nonlinear functions; the resulting classifier is called the Support Vector Machine (SVM). The SVM rests on a deep theoretical foundation and is a relatively recent method.

The main ideas of SVM can be summarized in two points (a code sketch follows the list):
(1) It analyzes the linearly separable case directly; for linearly non-separable data, a nonlinear mapping transforms the samples from the low-dimensional input space into a high-dimensional feature space where they become linearly separable, making it possible to apply a linear algorithm in that feature space to analyze the samples' nonlinear characteristics.
(2) Building on structural risk minimization theory, it constructs the optimal separating hyperplane in the feature space, so that the learner achieves a global optimum and the expected risk over the whole sample space satisfies a certain upper bound with some probability.

When studying this method, first understand how it frames the problem, which means starting from the simplest, linearly separable case; do not rush into the more complex linearly non-separable cases before that principle is understood. Designing an SVM requires solving constrained extremum problems, and hence the method of Lagrange multipliers.
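As a concrete illustration of point (1), a minimal scikit-learn sketch (my own illustration, not from the original post; the dataset and parameters are assumptions): the RBF kernel implicitly maps the two-circles data, which is not linearly separable in the input space, into a feature space where a linear separator exists.

    from sklearn.datasets import make_circles
    from sklearn.svm import SVC

    # Two concentric circles: not linearly separable in the 2-D input space.
    X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

    # A linear SVM cannot separate the rings well.
    print("linear:", SVC(kernel="linear").fit(X, y).score(X, y))

    # The RBF kernel corresponds to a nonlinear map into a high-dimensional
    # feature space where the two classes become linearly separable.
    print("rbf:", SVC(kernel="rbf", gamma=2.0).fit(X, y).score(X, y))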

Stanford Machine Learning Notes - 8. Support Vector Machines (SVMs) Overview

Submitted by 老子叫甜甜 on 2019-12-26 18:22:40
8. Support Vector Machines (SVMs)

Content
  8. Support Vector Machines (SVMs)
  8.1 Optimization Objective
  8.2 Large margin intuition
  8.3 Mathematics Behind Large Margin Classification
  8.4 Kernels
  8.5 Using an SVM
    8.5.1 Multi-class Classification
    8.5.2 Logistic Regression vs. SVMs

8.1 Optimization Objective

The Support Vector Machine (SVM) is a very useful supervised machine learning algorithm. First, recall logistic regression. From the properties of the log() function and the sigmoid function, the hypothesis is hθ(x) = 1 / (1 + e^(−θᵀx)), so when y = 1 we want θᵀx ≫ 0, and when y = 0 we want θᵀx ≪ 0. The (unregularized) cost function of logistic regression is:

    J(θ) = −(1/m) Σᵢ [ y⁽ⁱ⁾ log hθ(x⁽ⁱ⁾) + (1 − y⁽ⁱ⁾) log(1 − hθ(x⁽ⁱ⁾)) ]

To obtain the SVM cost function, we replace the two log terms with piecewise-linear approximations cost1(θᵀx) and cost0(θᵀx). Compared with the logistic optimization objective, the SVM optimization objective is:

    min over θ:  C Σᵢ [ y⁽ⁱ⁾ cost1(θᵀx⁽ⁱ⁾) + (1 − y⁽ⁱ⁾) cost0(θᵀx⁽ⁱ⁾) ] + (1/2) Σⱼ θⱼ²

Note 1: in fact, the cost0 and cost1 functions in the formula above are a kind of surrogate loss known as the hinge loss.
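A small numeric sketch of the hinge surrogate from Note 1 (my illustration; cost1/cost0 follow the piecewise-linear form used in the course, with the kink at ±1, and the data values are arbitrary):

    import numpy as np

    def cost1(z):
        # hinge piece used when y = 1: zero once z >= 1, linear penalty below
        return np.maximum(0.0, 1.0 - z)

    def cost0(z):
        # hinge piece used when y = 0: zero once z <= -1, linear penalty above
        return np.maximum(0.0, 1.0 + z)

    def svm_objective(theta, X, y, C):
        # C * sum of hinge terms + (1/2) * ||theta||^2 (bias omitted for brevity)
        z = X @ theta
        hinge = y * cost1(z) + (1 - y) * cost0(z)
        return C * hinge.sum() + 0.5 * np.dot(theta, theta)

    theta = np.array([0.5, -0.25])
    X = np.array([[1.0, 2.0], [2.0, -1.0], [-1.0, 0.5]])
    y = np.array([1, 0, 0])
    print(svm_objective(theta, X, y, C=1.0))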

How to train an SVM on my edge images using Java code

Submitted by 生来就可爱ヽ(ⅴ<●) on 2019-12-25 18:35:58
Question: I have a set of images on which I performed edge detection using OpenCV 3.1. The edges are stored in OpenCV Mat objects. Can someone help me with Java SVM train and test code for that set of images?

Answer 1: Following the discussion in the comments, I am providing you with an example project which I built for Android Studio a while back. This was used to classify images depending on Lab color spaces.

    //1.a Assign the parameters for SVM training here
    double nu = 0.999D;
    double gamma = 0.4D;
    double
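The scraped answer cuts off after the parameter declarations. For reference, a minimal sketch of the same train/predict flow in OpenCV's Python bindings (my own reconstruction, not the answerer's project; features, labels, and parameter values here are illustrative stand-ins):

    import cv2
    import numpy as np

    # Each row is one flattened edge image; labels are integer class ids.
    samples = np.random.rand(20, 64 * 64).astype(np.float32)   # placeholder features
    labels = np.random.randint(0, 2, (20, 1)).astype(np.int32)  # placeholder labels

    svm = cv2.ml.SVM_create()
    svm.setType(cv2.ml.SVM_NU_SVC)   # nu-SVC, matching the 'nu' parameter above
    svm.setKernel(cv2.ml.SVM_RBF)    # RBF kernel, matching the 'gamma' parameter
    svm.setNu(0.5)
    svm.setGamma(0.4)
    svm.train(samples, cv2.ml.ROW_SAMPLE, labels)

    _, predictions = svm.predict(samples)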

Notes on SVM

Submitted by 核能气质少年 on 2019-12-25 12:56:18
SVM

1. The decision function of a standard SVM can be written as:

    f(x) = Σᵢ aᵢ yᵢ K(xᵢ, x) + b

where the aᵢ are parameters to be optimized, whose physical meaning is the weight of each support vector sample; yᵢ denotes the label of training sample i (positive or negative); K(·,·) is the kernel function that computes the inner product; and b is another parameter to be optimized.

Its optimization objective is:

    min  (1/2)‖w‖² + C Σᵢ ξᵢ

where ‖w‖ describes the width from the separating surface to the support vectors (the margin is 2/‖w‖, so the larger ‖w‖ is, the smaller the margin); C is the penalty factor; and the ξᵢ are slack variables introduced to handle non-separable problems.

To optimize this problem, Lagrange multipliers are introduced, turning it into a dual problem in the aᵢ, where each aᵢ is, mathematically, the Lagrange coefficient of one constraint.

MKL (multiple kernel learning) can be seen as an improved version of the SVM, with a decision function of the form:

    f(x) = Σᵢ aᵢ yᵢ Σₖ βₖ Kₖ(xᵢ, x) + b

where Kₖ(xᵢ, x) is the k-th kernel function and βₖ is the corresponding kernel weight. The corresponding optimization problem replaces the single kernel with this weighted combination, subject to βₖ ≥ 0 and Σₖ βₖ = 1.

When optimizing this problem, Lagrange coefficients are introduced twice: the aᵢ play the same role as before and can be understood as sample weights, while the βₖ can be understood as kernel weights, mathematically the Lagrange coefficients introduced for each kernel. The detailed optimization process is omitted here, otherwise this would turn into a translated paper; interested readers can consult the reference documents mentioned later.

Comparing the two, MKL has one extra layer of optimization parameters, the βₖ, whose physical meaning is the weight of each kernel under those constraints. Formally, the SVM decision function resembles a neural network whose output is a linear combination of several middle-layer nodes, while the MKL decision function resembles a network one level higher than the SVM: its output is a linear combination of the outputs of a middle layer of kernel functions. [Figure: the single-kernel SVM and MKL decision functions viewed as layered networks]
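A minimal sketch of the fixed-weight special case of the MKL decision function (my illustration, not from the post: the kernel weights βₖ are fixed by hand rather than learned, using scikit-learn's precomputed-kernel interface; the data is synthetic):

    import numpy as np
    from sklearn.metrics.pairwise import rbf_kernel, polynomial_kernel
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 5))
    y = (X[:, 0] * X[:, 1] > 0).astype(int)

    # Weighted combination of two base kernels; in true MKL the weights
    # beta_k would be optimized, here they are fixed for illustration.
    beta = [0.7, 0.3]
    K = beta[0] * rbf_kernel(X, X, gamma=0.5) + beta[1] * polynomial_kernel(X, X, degree=2)

    clf = SVC(kernel="precomputed").fit(K, y)
    print("training accuracy:", clf.score(K, y))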

Decomposition of matrices for CPLEX and machine learning application

Submitted by 房东的猫 on 2019-12-25 09:19:20
Question: I am dealing with big matrices, and from time to time my code ends with a 'Killed: 9' message in my terminal. I'm working on Mac OS X. A wise programmer tells me the problem in my code is linked to the stored matrix I am dealing with.

    import numpy as np

    nn = 35000
    dd = 35
    XX = np.random.rand(nn, dd)
    XX = XX.dot(XX.T)  # it should be faster than np.dot(XX, XX.T)
    yy = np.random.rand(nn, 1)
    XX = np.multiply(XX, yy.T)

I have to store this huge matrix XX. My guess: I split the matrix with

    upp = np.triu(XX)

Do I actually save space
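For scale, a quick back-of-the-envelope check (my addition, not part of the question): the dense 35000×35000 float64 product needs roughly 9.8 GB, which is a plausible cause of the OS killing the process. Note also that np.triu returns another full dense array with the lower triangle zeroed out, so by itself it saves no memory:

    import numpy as np

    nn = 35000
    bytes_needed = nn * nn * 8   # float64 = 8 bytes per element
    print(f"{bytes_needed / 1e9:.1f} GB for the dense product")  # ~9.8 GB

    # np.triu allocates a full nn x nn array; the zeros are still stored.
    small = np.random.rand(4, 4)
    print(np.triu(small).nbytes == small.nbytes)  # True: same footprint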

My rows are mismatched in my SVM scripting code for Kaggle

Submitted by 杀马特。学长 韩版系。学妹 on 2019-12-25 09:06:06
Question: I am reviewing my e1071 code for SVM on the Kaggle Titanic data. The last I knew, this part of it was working, but now I'm getting a rather strange error. When I try to build my data.frame so I can submit to Kaggle, it seems my prediction is the size of my training set instead of the test set.

Problem:

    Error in data.frame(PassengerId = test$passengerid, Survived = prediction) :
      arguments imply differing number of rows: 418, 714

Obviously, they should both be 418, and I do not understand what is
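The scrape ends before any answer. A likely reading of the numbers: 418 is the Titanic test set, while 714 matches the training rows that remain after NA handling, suggesting the prediction was made on (a subset of) the training frame rather than the test frame. A generic sketch of the invariant being violated (my illustration in Python/pandas rather than the asker's R; the data frames are stand-ins):

    import numpy as np
    import pandas as pd

    # Stand-ins for the asker's data: the Titanic test set has 418 rows.
    test = pd.DataFrame({"PassengerId": range(892, 1310)})
    prediction = np.zeros(len(test), dtype=int)  # must be one prediction per test row

    # Guard that catches this class of bug (predicting on the training set,
    # or on the rows left after NA removal) before building the submission:
    if len(prediction) != len(test):
        raise ValueError(f"arguments imply differing number of rows: "
                         f"{len(test)}, {len(prediction)}")

    submission = pd.DataFrame({"PassengerId": test["PassengerId"],
                               "Survived": prediction})
    print(submission.head())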

Libsvm: Wrong input format at line 1

Submitted by 混江龙づ霸主 on 2019-12-25 08:27:14
Question: I am trying to use libsvm and I get the following behaviour:

    root@bcfd88c873fa:/home/libsvm# ./svm-train myfile
    Wrong input format at line 1
    root@bcfd88c873fa:/home/libsvm# head -n 5 myfile
    2 0:0.00000 8:0.00193 2:0.00000 1:0.00000 10:0.00722
    3 6:0.00235 2:0.00000 0:0.00000 1:0.00000 5:0.00155
    4 0:0.00000 1:0.00000 2:0.00000 4:0.00187
    3 6:0.00121 8:0.00211 1:0.00000 2:0.00000 0:0.00000
    3 0:0.00000 2:0.00000 1:0.00000

Can you see anything wrong with the format? It works with other svm
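No answer survived the scrape, but the sample lines do violate two documented constraints of the libsvm file format: feature indices must start at 1 (index 0 is reserved for precomputed kernels) and must appear in ascending order. A minimal sketch (my own; the file names are hypothetical) that normalizes such a file:

    # Rewrite a libsvm file so feature indices are 1-based and ascending.
    def normalize_libsvm_line(line):
        parts = line.split()
        label, feats = parts[0], parts[1:]
        # Shift every index up by one, then sort pairs by index.
        pairs = sorted((int(f.split(":")[0]) + 1, f.split(":")[1]) for f in feats)
        return " ".join([label] + [f"{i}:{v}" for i, v in pairs])

    with open("myfile") as src, open("myfile.fixed", "w") as dst:
        for line in src:
            if line.strip():
                dst.write(normalize_libsvm_line(line.strip()) + "\n")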

Having different results every run with GMM Classifier

Submitted by 我与影子孤独终老i on 2019-12-25 08:11:24
Question: I'm currently doing a project related to speech recognition and machine learning. I have two classes now, and I create one GMM classifier for each class, for the labels 'happy' and 'sad'. I want to train the GMM classifiers with MFCC vectors, using one GMM classifier per label (previously it was one GMM per file). But every time I run the script I get different results. What might be the cause, given the same test and train samples? In the outputs below, please note that I have 10 test
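The scrape cuts off before any answer. A common cause of run-to-run variation with GMMs is the random initialization of EM; in scikit-learn this is controlled by random_state. A sketch under the assumption that the asker uses sklearn's GaussianMixture (the features below are synthetic stand-ins for MFCC vectors):

    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(0)
    mfcc_happy = rng.normal(0.0, 1.0, size=(200, 13))  # stand-in MFCC frames
    mfcc_sad = rng.normal(1.0, 1.0, size=(200, 13))

    # Fixing random_state makes the initialization and EM reproducible.
    gmm_happy = GaussianMixture(n_components=8, random_state=42).fit(mfcc_happy)
    gmm_sad = GaussianMixture(n_components=8, random_state=42).fit(mfcc_sad)

    test = rng.normal(0.0, 1.0, size=(50, 13))
    # Classify by comparing average log-likelihood under each class model.
    label = "happy" if gmm_happy.score(test) > gmm_sad.score(test) else "sad"
    print(label)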

Python-Scikit. Training and testing data using SVM

Submitted by 依然范特西╮ on 2019-12-25 06:55:35
Question: I am working on training and testing data using SVM (scikit-learn). I train the SVM and save it as a pickle, then use that pickle to test my system. First I read the training data and testing data into the variables train_data and test_data respectively. After that, the code I am using for training is:

    from sklearn.feature_extraction.text import TfidfVectorizer

    vectorizer = TfidfVectorizer(max_df=0.8, sublinear_tf=True, use_idf=True)
    train_vectors = vectorizer.fit_transform(train_data)
    test_vectors = vectorizer.transform(test_data)
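The post breaks off here. A minimal continuation sketch of the train-pickle-test cycle the asker describes (my reconstruction, continuing the variables above; train_labels and the file name are assumptions). One detail worth noting: the fitted vectorizer must be pickled alongside the classifier, since test data has to pass through the same fitted vocabulary.

    import pickle
    from sklearn.svm import LinearSVC

    clf = LinearSVC()
    clf.fit(train_vectors, train_labels)  # train_labels: assumed label list

    # Persist the fitted vectorizer and the classifier together.
    with open("svm_model.pkl", "wb") as f:
        pickle.dump((vectorizer, clf), f)

    # Later, at test time:
    with open("svm_model.pkl", "rb") as f:
        vectorizer, clf = pickle.load(f)
    predictions = clf.predict(vectorizer.transform(test_data))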

Extract coefficients/weights from libsvm model file

Submitted by ◇◆丶佛笑我妖孽 on 2019-12-25 05:04:29
Question: I am using libsvm to create a two-class classifier. I wish to extract the coefficient/weight of each feature used by the model generated by

    ./svm-train training.training model.model

The model.model file looks like:

    svm_type c_svc
    kernel_type rbf
    gamma 8
    nr_class 2
    total_sv 442
    rho 21
    label 1 -1
    nr_sv 188 254
    SV
    7080.357768871263 0:0 1:0.00643 2:0.01046 3:0.00963 4:0.02777 5:0.04338 19:0.04468
    528.7111702760092 0:0 1:0.00058 3:0.00086 6:0.01158 7:0.0028 9:0.08991 13:0.0096
    ...
    391
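No answer survived the scrape. One caveat worth noting: this model uses an RBF kernel (kernel_type rbf), for which a single per-feature weight vector is not defined; an explicit weight vector w = Σᵢ αᵢ yᵢ xᵢ over the support vectors only exists for a linear kernel. A sketch of that linear-kernel computation using scikit-learn (my illustration on synthetic data, not a parser for the libsvm model file):

    import numpy as np
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)
    X = rng.normal(size=(60, 4))
    y = (X @ np.array([1.0, -2.0, 0.5, 0.0]) > 0).astype(int)

    clf = SVC(kernel="linear").fit(X, y)

    # For a linear kernel, the weight vector is the dual coefficients
    # (alpha_i * y_i) multiplied by the support vectors and summed.
    w = clf.dual_coef_ @ clf.support_vectors_
    print(np.allclose(w, clf.coef_))  # True: matches the reported weights
    print(clf.coef_)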