I want to make a Java application that recognizes characters using libsvm, but now that I have gotten into it, I do not understand how to prepare and train the image data for libsvm.
libsvm
has a specific data format; each line is one training/testing vector in the form
LABEL INDEX1:VALUE1 INDEX2:VALUE2 ... INDEXN:VALUEN
where the indices must be in ascending order, starting from 1. So in the most "naive" method, you simply convert the matrix representation to a row representation by concatenating consecutive rows, so an image like
010
011
000
would become
010011000
and in the libsvm format (assuming we label it with "5"):
5 1:0 2:1 3:0 4:0 5:1 6:1 7:0 8:0 9:0
As libsvm supports a "sparse" representation, you can omit the zero-valued features:
5 2:1 5:1 6:1
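Since the question mentions a Java application, here is a minimal sketch of this flattening step in Java. The class and method names are made up for illustration; only the format itself (1-based indices, zero features omitted) comes from libsvm.

```java
// Sketch: flatten a binary image matrix into one sparse libsvm line.
// The 3x3 image and the label "5" match the example above.
public class LibsvmLine {
    // Build a sparse libsvm line: "LABEL idx:val ..." with 1-based indices.
    static String toLibsvm(int label, int[][] image) {
        StringBuilder sb = new StringBuilder();
        sb.append(label);
        int index = 1;                      // libsvm indices start at 1
        for (int[] row : image)
            for (int pixel : row) {
                if (pixel != 0)             // sparse format: skip zero features
                    sb.append(' ').append(index).append(':').append(pixel);
                index++;
            }
        return sb.toString();
    }

    public static void main(String[] args) {
        int[][] image = { {0, 1, 0}, {0, 1, 1}, {0, 0, 0} };
        System.out.println(toLibsvm(5, image)); // prints: 5 2:1 5:1 6:1
    }
}
```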
This is the manual way. Sample data is located here: http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary/a1a
The easiest "automatic" way is to represent your data in the .csv format (again, first convert the data to the row-like format, then to .csv), which is a quite standard method:
LABEL,PIXEL_0,PIXEL_1,...,PIXEL_N
...
and then use this program for the conversion:
/* convert csv data to libsvm/svm-light format */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char buf[10000000];
float feature[100000];

int main(int argc, char **argv)
{
    FILE *fp;
    if (argc != 2) {
        fprintf(stderr, "Usage: %s filename\n", argv[0]);
        return 1;
    }
    if ((fp = fopen(argv[1], "r")) == NULL) {
        fprintf(stderr, "Can't open input file %s\n", argv[1]);
        return 1;
    }
    while (fscanf(fp, "%[^\n]\n", buf) == 1) {
        int i = 0, j;
        char *p = strtok(buf, ",");   /* first field is the label */
        feature[i++] = atof(p);
        while ((p = strtok(NULL, ",")))
            feature[i++] = atof(p);
        printf("%d", (int) feature[0]);
        for (j = 1; j < i; j++)       /* libsvm indices start at 1 */
            printf(" %d:%f", j, feature[j]);
        printf("\n");
    }
    fclose(fp);
    return 0;
}
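As the question mentions a Java application, the same per-line conversion can also be sketched directly in Java. `Csv2Libsvm` and `convertLine` are hypothetical names, not part of any library; the program reads CSV from standard input and keeps each value's original text instead of round-tripping it through a float:

```java
// Sketch: convert CSV lines "LABEL,PIXEL_1,...,PIXEL_N" to libsvm format.
import java.io.*;

public class Csv2Libsvm {
    // Convert one CSV line into a dense libsvm line with 1-based indices.
    static String convertLine(String csvLine) {
        String[] fields = csvLine.split(",");
        StringBuilder sb = new StringBuilder(fields[0]);   // the label
        for (int j = 1; j < fields.length; j++)            // indices start at 1
            sb.append(' ').append(j).append(':').append(fields[j]);
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Reads CSV from stdin, writes libsvm-formatted lines to stdout.
        BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
        for (String line; (line = in.readLine()) != null; )
            if (!line.isEmpty())
                System.out.println(convertLine(line));
    }
}
```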
Both training and testing files have exactly the same structure. Simply split your data randomly, in some proportion (3:1 or 9:1), into a training file and a testing file, but remember to include a balanced number of training vectors for each class in each file.
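The split above can be sketched as a stratified split: shuffle and cut each class separately, so both files keep the class proportions. The names here are illustrative, not a fixed API:

```java
// Sketch: randomly split libsvm lines into training/testing sets,
// keeping classes balanced by splitting per label (stratified split).
import java.util.*;

public class StratifiedSplit {
    // lines: libsvm-formatted lines; returns {training, testing}.
    static List<List<String>> split(List<String> lines, double trainRatio, long seed) {
        Map<String, List<String>> byLabel = new LinkedHashMap<>();
        for (String line : lines)   // the label is the first token of each line
            byLabel.computeIfAbsent(line.split(" ")[0], k -> new ArrayList<>()).add(line);
        List<String> train = new ArrayList<>(), test = new ArrayList<>();
        Random rnd = new Random(seed);
        for (List<String> group : byLabel.values()) {
            Collections.shuffle(group, rnd);          // random split per class
            int cut = (int) Math.round(group.size() * trainRatio);
            train.addAll(group.subList(0, cut));
            test.addAll(group.subList(cut, group.size()));
        }
        return Arrays.asList(train, test);
    }

    public static void main(String[] args) {
        List<String> data = new ArrayList<>();
        for (int i = 0; i < 8; i++) data.add("0 1:" + i);  // 8 examples of class 0
        for (int i = 0; i < 8; i++) data.add("1 1:" + i);  // 8 examples of class 1
        List<List<String>> parts = split(data, 0.75, 42);  // 3:1 split
        System.out.println(parts.get(0).size() + " training, "
                         + parts.get(1).size() + " testing"); // 12 training, 4 testing
    }
}
```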
In particular, your data looks a bit like the MNIST dataset; if that is the case, it is already prepared for libsvm:
http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/multiclass.html
MNIST training: http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/multiclass/mnist.scale.bz2
MNIST testing : http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/multiclass/mnist.scale.t.bz2
If it is possible with your data, converting your images to real-valued ones in the [0,1] interval would be more valuable than binary data (which loses much information).
EDIT
As an example, if your image is an 8-bit greyscale image, then each pixel is in fact a number v
between 0 and 255. What you are doing now is some thresholding: setting 1 for v > T
and 0 for v <= T
, while mapping these values to real values would give more information to the model. It can be done by simply squashing: v / 255
. As a result, all values are in the [0,1]
interval, but also take values "in between", like 0.25
, etc.
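A minimal sketch of this squashing in Java (the class name is made up for illustration):

```java
// Sketch: map 8-bit grey values to the [0,1] interval by dividing by 255,
// instead of thresholding them to {0, 1}.
public class GreyScale {
    static double[] squash(int[] pixels) {
        double[] out = new double[pixels.length];
        for (int i = 0; i < pixels.length; i++)
            out[i] = pixels[i] / 255.0;   // 0 -> 0.0, 255 -> 1.0, 64 -> ~0.25
        return out;
    }

    public static void main(String[] args) {
        double[] v = squash(new int[]{0, 64, 255});
        for (double x : v)
            System.out.println(x);   // values now lie in [0,1]
    }
}
```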