Question
I have a library that contains a pattern image for every character of a font (Arial in my case).

I'm using this library to OCR text from images.
The problem is that characters such as "j", "/", and "t" can overlap their neighbours when rendered. OCR then fails, because the characters in the image no longer match the pattern images exactly (up to 3 pixels are different).

How should I deal with this problem? Is there a better way to compare images? (C#, WinForms app)
I'm using this method for comparison:
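using System;
using System.Drawing;
using System.Drawing.Imaging;
using System.Runtime.InteropServices;

// Both members below belong inside a class.
// memcmp from the C runtime; returns 0 when the two buffers are byte-for-byte equal.
[DllImport("msvcrt.dll", CallingConvention = CallingConvention.Cdecl)]
private static extern int memcmp(IntPtr b1, IntPtr b2, UIntPtr count);

public static bool CompareMemCmp(Bitmap b1, Bitmap b2)
{
    // Exactly one null is a mismatch; two nulls are treated as equal here
    // (the original fell through and threw on b1.Size in that case).
    if (b1 == null || b2 == null) return b1 == b2;
    if (b1.Size != b2.Size) return false;
    var rect = new Rectangle(new System.Drawing.Point(0, 0), b1.Size);
    var bd1 = b1.LockBits(rect, ImageLockMode.ReadOnly, PixelFormat.Format32bppArgb);
    var bd2 = b2.LockBits(rect, ImageLockMode.ReadOnly, PixelFormat.Format32bppArgb);
    try
    {
        // At 32bpp each row is exactly width * 4 bytes, so the whole buffer can be compared at once.
        int len = bd1.Stride * b1.Height;
        return memcmp(bd1.Scan0, bd2.Scan0, (UIntPtr)(uint)len) == 0;
    }
    finally
    {
        b1.UnlockBits(bd1);
        b2.UnlockBits(bd2);
    }
}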
It's extremely fast and reliable, but unfortunately it cannot produce a match when the overlap described above occurs.
Answer 1:
You could treat these character pairs as "characters" in their own right, i.e. the "-j" combination would be recognized as a single "-j" character. There could be an unreasonable number of such pairs, though.
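One way to build such pair patterns without drawing each one by hand might be to overlay two single-glyph patterns with a known overlap. The following is only a sketch under assumptions that are not in the question: glyphs are held as binary masks (bool[,], true = ink, indexed [row, column]) of equal height, and the overlap in columns for a given pair is already known. The names PairTemplates and Compose are hypothetical.

```csharp
using System;

public static class PairTemplates
{
    // Overlays two glyph masks of equal height. The right glyph starts
    // `overlap` columns before the left glyph ends; overlapping ink is OR-ed.
    public static bool[,] Compose(bool[,] left, bool[,] right, int overlap)
    {
        int h = left.GetLength(0);
        if (right.GetLength(0) != h)
            throw new ArgumentException("Glyphs must have the same height.");
        int lw = left.GetLength(1), rw = right.GetLength(1);
        if (overlap < 0 || overlap > Math.Min(lw, rw))
            throw new ArgumentOutOfRangeException(nameof(overlap));

        int offset = lw - overlap;               // column where the right glyph starts
        var result = new bool[h, offset + rw];
        for (int y = 0; y < h; y++)
        {
            for (int x = 0; x < lw; x++) result[y, x] = left[y, x];
            for (int x = 0; x < rw; x++) result[y, offset + x] |= right[y, x];
        }
        return result;
    }
}
```

The composed mask could then be stored in the pattern library under a two-character label such as "-j" and matched like any other pattern. Generating pairs only for glyphs known to overhang (such as "j", "/", "t") would keep the number of extra patterns manageable.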
Answer 2:
You could return a score for each candidate character: a kind of probability that the image depicts that character.
You could weight matching pixels near the center more heavily than pixels near the edges, so that stray edge pixels hurt the guess less.
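A rough sketch of that idea follows. It assumes the character image and the pattern have already been converted to same-sized binary masks (bool[,], true = ink, indexed [row, column]). The weighting (1 plus the distance to the nearest border), the threshold, and the names GlyphScore, Score, and BestMatch are illustrative assumptions, not something taken from the question.

```csharp
using System;
using System.Collections.Generic;

public static class GlyphScore
{
    // Returns a similarity in [0, 1]. Pixels far from the border carry more
    // weight, so a few foreign pixels at the edge lower the score only slightly.
    public static double Score(bool[,] candidate, bool[,] template)
    {
        int h = template.GetLength(0), w = template.GetLength(1);
        if (h == 0 || w == 0) return 0.0;
        if (candidate.GetLength(0) != h || candidate.GetLength(1) != w) return 0.0;

        double matched = 0.0, total = 0.0;
        for (int y = 0; y < h; y++)
        {
            for (int x = 0; x < w; x++)
            {
                // Distance in pixels to the nearest border (0 on the border itself).
                int d = Math.Min(Math.Min(x, w - 1 - x), Math.Min(y, h - 1 - y));
                double weight = 1.0 + d;         // assumed weighting, tune as needed
                total += weight;
                if (candidate[y, x] == template[y, x]) matched += weight;
            }
        }
        return matched / total;
    }

    // Returns the label of the best-scoring pattern, or null if none reaches the threshold.
    public static string BestMatch(bool[,] candidate,
                                   IDictionary<string, bool[,]> templates,
                                   double threshold)
    {
        string best = null;
        double bestScore = threshold;
        foreach (var entry in templates)
        {
            double s = Score(candidate, entry.Value);
            if (s >= bestScore) { best = entry.Key; bestScore = s; }
        }
        return best;
    }
}
```

Unlike the exact memcmp comparison, this accepts a pattern whose score is merely high enough, so a difference of a few edge pixels no longer rules out a match. The threshold would have to be chosen experimentally so that similar glyphs (for example "l" and "I") are still told apart.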
Source: https://stackoverflow.com/questions/9945972/ocr-fails-due-to-font-specifics