Question
By default, scikit-learn uses a one-vs-one classification scheme when training SVMs in the multiclass case.
I'm a bit confused: when you access attributes such as svm.n_support_ or svm.support_vectors_, which support vectors are you getting? For instance, the iris dataset has 3 classes, so there should be a total of 3*(3-1)/2 = 3 different SVM classifiers. Which classifier's support vectors are being returned?
Answer 1:
Update:
dual_coef_
is the key: it gives you the coefficients of the support vectors in the decision function. "Each of the support vectors is used in n_class - 1 classifiers. The n_class - 1 entries in each row correspond to the dual coefficients for these classifiers." Take a look at the very end of section 1.4.1.1; the table there explains it clearly: http://scikit-learn.org/stable/modules/svm.html#multi-class-classification
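To make the shape concrete, here is a minimal sketch (not from the answer) fitting an SVC on iris. The kernel choice is arbitrary; the point is only that dual_coef_ has n_class - 1 rows, one dual coefficient per support vector for each of the classifiers that vector participates in:

```python
from sklearn.datasets import load_iris
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
svm = SVC().fit(X, y)  # default kernel; exact SV count depends on parameters

# dual_coef_ has shape (n_classes - 1, n_SV): each support vector
# appears in n_classes - 1 of the one-vs-one classifiers.
print(svm.dual_coef_.shape)        # (2, n_SV) for 3 classes
print(svm.support_vectors_.shape)  # (n_SV, 4) for iris's 4 features
```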
The implementation details are confusing to me as well; the coefficients of the support vectors in the multiclass decision function are non-trivial.
But here is the rule of thumb I use whenever I want to inspect specific properties of the chosen support vectors:
y[svm.support_]
outputs:
array([0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2])
This way you get to know (maybe for debugging purposes) which support vector corresponds to which class. And of course you can inspect the support vectors themselves:
X[svm.support_]
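The two indexing tricks above can be combined with n_support_, which gives the per-class counts. A minimal sketch (mine, not the answer's) showing that support_ is grouped by class, which is why the label array above comes out sorted:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
svm = SVC().fit(X, y)

# support_ indexes into the training set, so y[svm.support_]
# labels each support vector with its class.
labels = y[svm.support_]

# n_support_ holds one count per class, and support_ is ordered
# by class, so the counts partition the label array.
print(svm.n_support_)
assert svm.n_support_.sum() == len(svm.support_)
assert np.array_equal(labels, np.sort(labels))  # grouped: 0s, then 1s, then 2s
```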
My intuition here is that, as the name indicates, you take subsets of samples from the categories involved. Say we have 3 categories A, B and C:
- A vs. B --> it gives you several support vectors from A and B (a, a, a, b, b, ...)
- A vs. C --> same: a, a, a, c, c, c, c (some of the a's may be repeats from before)
- B vs. C --> likewise
So svm.support_vectors_
returns all the support vectors, but how they are used in decision_function
is still tricky to me: I'm not sure whether it could, for example, use support vectors from A vs. B when evaluating the pair A vs. C, and I couldn't find implementation details (http://scikit-learn.org/stable/modules/generated/sklearn.multiclass.OneVsOneClassifier.html#sklearn.multiclass.OneVsOneClassifier.decision_function)
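One thing that can be verified directly (a sketch of mine, not part of the answer) is that with decision_function_shape="ovo", SVC exposes one decision value per pairwise classifier, so the 3 columns correspond to the 3 pairs (0 vs 1), (0 vs 2), (1 vs 2):

```python
from sklearn.datasets import load_iris
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Request the raw one-vs-one decision values instead of the
# default one-vs-rest-shaped aggregation.
svm = SVC(decision_function_shape="ovo").fit(X, y)

# One column per class pair: 3 classes -> 3*(3-1)/2 = 3 columns.
print(svm.decision_function(X).shape)  # (150, 3)
```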
Answer 2:
All support vectors of all 3 classifiers.
Look at svm.support_.shape
: it is 45.
19 + 19 + 7 = 45, so it all adds up.
Also, if you look at svm.support_vectors_.shape
it will be (45, 4), i.e. [n_SV, n_features]
. Again this makes sense, because we have 45 support vectors and 4 features in the iris dataset.
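A related sanity check (mine, not the answer's; the exact count of 45 depends on the SVC parameters and scikit-learn version, so it is not asserted here): support_vectors_ is literally the training matrix indexed by support_, stacking the deduplicated support vectors of all pairwise classifiers.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
svm = SVC().fit(X, y)

# support_vectors_ is exactly the rows of X picked out by support_.
assert np.array_equal(svm.support_vectors_, X[svm.support_])

# The per-class counts in n_support_ sum to the total SV count.
assert svm.support_vectors_.shape == (svm.n_support_.sum(), X.shape[1])
print(svm.support_vectors_.shape)  # (n_SV, 4)
```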
Source: https://stackoverflow.com/questions/35022270/which-support-vectors-returned-in-multiclass-svm-sklearn