Question
What are the recommended parameters for OpenCV's SVM? I'm experimenting with letter_recog.cpp from the OpenCV samples directory, but the SVM accuracy is very poor. In one run I got only 62% accuracy:
$ ./letter_recog_modified -data /home/cobalt/opencv/samples/data/letter-recognition.data -save svm_letter_recog.xml -svm
The database /home/cobalt/opencv/samples/data/letter-recognition.data is loaded.
Training the classifier ...
data.size() = [16 x 20000]
responses.size() = [1 x 20000]
Recognition rate: train = 64.3%, test = 62.2%
The default parameters are:
model = SVM::create();
model->setType(SVM::C_SVC);
model->setKernel(SVM::LINEAR);
model->setC(1);
model->train(tdata);
Switching to trainAuto() didn't help; it gave me a strange 0% test accuracy:
model = SVM::create();
model->setType(SVM::C_SVC);
model->setKernel(SVM::LINEAR);
model->trainAuto(tdata);
Result:
Recognition rate: train = 0.0%, test = 0.0%
Update using yangjie's answer:
$ ./letter_recog_modified -data /home/cobalt/opencv/samples/data/letter-recognition.data -save svm_letter_recog.xml -svm
The database /home/cobalt/opencv/samples/data/letter-recognition.data is loaded.
Training the classifier ...
data.size() = [16 x 20000]
responses.size() = [1 x 20000]
Recognition rate: train = 58.8%, test = 57.5%
The result is no longer 0% but the accuracy is worse than the 62% earlier.
Using the RBF kernel with trainAuto() is even worse:
$ ./letter_recog_modified_rbf -data /home/cobalt/opencv/samples/data/letter-recognition.data -save svm_letter_recog.xml -svm
The database /home/cobalt/opencv/samples/data/letter-recognition.data is loaded.
Training the classifier ...
data.size() = [16 x 20000]
responses.size() = [1 x 20000]
Recognition rate: train = 18.5%, test = 11.6%
Parameters:
model = SVM::create();
model->setType(SVM::C_SVC);
model->setKernel(SVM::RBF);
model->trainAuto(tdata);
Answer 1:
I debugged the sample code and found the reason.
The responses matrix is a Mat holding the ASCII codes of the letters. However, the predicted labels returned by an SVM trained with SVM::trainAuto range from 0 to 25, which correspond to the 26 classes. This can also be observed by looking at <class_labels>...</class_labels> in the output file svm_letter_recog.xml.
Therefore, in test_and_save_classifier, r = model->predict( sample ) and responses.at<int>(i) are clearly not equal.
I also found that if we use SVM::train, the class labels are the ASCII codes 65-90 ('A' to 'Z') instead, which is why you got a normal result at first.
Solution
I am not sure whether this is a bug. For now, if you want to use SVM::trainAuto in this sample, change the following call in build_svm_classifier:
test_and_save_classifier(model, data, responses, ntrain_samples, 0, filename_to_save);
to
test_and_save_classifier(model, data, responses, ntrain_samples, 'A', filename_to_save);
Update
trainAuto and train should behave the same way with respect to class_labels. The discrepancy was introduced by an earlier bug fix, so I have created a pull request to OpenCV to fix the problem.
Answer 2:
I suggest trying the RBF kernel instead of the linear one. In many, many cases it is the best choice...
Source: https://stackoverflow.com/questions/31178095/recommended-values-for-opencv-svm-parameters