SKLearn: Getting distance of each point from decision boundary?

倾然丶 夕夏残阳落幕 submitted on 2019-12-08 08:07:31

Question


I am using SKLearn to run SVC on my data.

from sklearn import svm

svc = svm.SVC(kernel='linear', C=C).fit(X, y)

How can I get the distance of each data point in X from the decision boundary?


Answer 1:


For a linear kernel, the decision boundary is w·x + b = 0. For a point x, the decision function value is y = w·x + b, and its signed distance to the boundary is y / ||w||.

import numpy as np

y = svc.decision_function(X)         # signed decision values, one per sample
w_norm = np.linalg.norm(svc.coef_)   # ||w|| for the linear kernel
dist = y / w_norm                    # signed distance of each point to the boundary

For non-linear kernels, there is no direct way to get the absolute distance in the input space, but you can still use the value of decision_function as a relative (signed) distance.
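For example, a minimal sketch (assuming an RBF kernel and the same X and y as in the question; rbf_svc is just an illustrative name):

from sklearn import svm

rbf_svc = svm.SVC(kernel='rbf', C=1.0, gamma='scale').fit(X, y)

# Proportional to the feature-space distance from the boundary:
# the sign gives the side, larger magnitude means farther away.
relative_dist = rbf_svc.decision_function(X)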




Answer 2:


It happens that I am doing homework 1 of a course called Machine Learning Techniques, and it includes a problem about a point's distance to the hyperplane, even for the RBF kernel.

First, we know that the SVM finds an "optimal" w for the hyperplane w·x + b = 0.

And the fact is that

w = \sum_{i} \alpha_i \phi(x_i)

where the x_i are the support vectors and the \alpha_i are their coefficients. Note the \phi() wrapping each x_i: it is the feature map that transforms x into some high-dimensional space (for RBF, an infinite-dimensional one). We also know that

\phi(x_1) \cdot \phi(x_2) = K(x_1, x_2)

so we can compute

\|w\|^2 = w \cdot w = \sum_{i} \sum_{j} \alpha_i \alpha_j \phi(x_i) \cdot \phi(x_j) = \sum_{i} \sum_{j} \alpha_i \alpha_j K(x_i, x_j)

and take the square root to get \|w\|. So, the distance you want is

svc.decision_function(X) / w_norm

where w_norm is the norm \|w\| computed above.
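For instance, here is a minimal sketch of the computation above for an RBF kernel, using sklearn's dual_coef_ (which stores \alpha_i y_i for the support vectors) and rbf_kernel to evaluate K. The toy X, y and the explicit gamma value are assumptions for illustration:

import numpy as np
from sklearn import svm
from sklearn.metrics.pairwise import rbf_kernel

# Toy data standing in for the X, y of the question (hypothetical).
rng = np.random.RandomState(0)
X = rng.randn(40, 2)
y = (X[:, 0] * X[:, 1] > 0).astype(int)

gamma = 0.5                                   # explicit gamma so it can be reused below
svc = svm.SVC(kernel='rbf', gamma=gamma, C=1.0).fit(X, y)

# dual_coef_ stores alpha_i * y_i for each support vector (shape (1, n_SV) for binary problems)
alpha = svc.dual_coef_.ravel()
sv = svc.support_vectors_

# ||w||^2 = sum_i sum_j alpha_i alpha_j K(x_i, x_j), evaluated with the kernel trick
K = rbf_kernel(sv, sv, gamma=gamma)
w_norm = np.sqrt(alpha @ K @ alpha)

# Signed distance (in the implicit feature space) of every point to the boundary
dist = svc.decision_function(X) / w_norm

With a linear kernel, this kernel-trick computation of ||w|| reduces to the same value as np.linalg.norm(svc.coef_) used in Answer 1.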




Source: https://stackoverflow.com/questions/32074239/sklearn-getting-distance-of-each-point-from-decision-boundary
