iOS Tesseract OCR Image Preparation

一生所求 2020-12-07 16:21

I would like to implement an OCR application that would recognize text from Photos.

I succeeded in compiling and integrating the Tesseract engine in iOS, and I succeeded …

2 Answers
  •  旧巷少年郎
    2020-12-07 16:29

    I have used the code above, but added two other function calls as well to convert the image so that it works with Tesseract.

    First, I used an image-resizing method to convert the image to 640 × 640, which seems to be more manageable for Tesseract.

    -(UIImage *)resizeImage:(UIImage *)image {
    
        CGImageRef imageRef = [image CGImage];
        CGImageAlphaInfo alphaInfo = CGImageGetAlphaInfo(imageRef);
        CGColorSpaceRef colorSpaceInfo = CGColorSpaceCreateDeviceRGB();
    
        if (alphaInfo == kCGImageAlphaNone)
            alphaInfo = kCGImageAlphaNoneSkipLast;
    
        int width, height;
    
        width = 640;//[image size].width;
        height = 640;//[image size].height;
    
        CGContextRef bitmap;
    
        if (image.imageOrientation == UIImageOrientationUp || image.imageOrientation == UIImageOrientationDown) {
            // Pass 0 for bytesPerRow so Core Graphics computes it for the new 640-pixel-wide bitmap
            // (the source image's own bytes-per-row would be wrong here).
            bitmap = CGBitmapContextCreate(NULL, width, height, CGImageGetBitsPerComponent(imageRef), 0, colorSpaceInfo, alphaInfo);
    
        } else {
            bitmap = CGBitmapContextCreate(NULL, height, width, CGImageGetBitsPerComponent(imageRef), 0, colorSpaceInfo, alphaInfo);
    
        }
    
        if (image.imageOrientation == UIImageOrientationLeft) {
            NSLog(@"image orientation left");
            CGContextRotateCTM (bitmap, radians(90));
            CGContextTranslateCTM (bitmap, 0, -height);
    
        } else if (image.imageOrientation == UIImageOrientationRight) {
            NSLog(@"image orientation right");
            CGContextRotateCTM (bitmap, radians(-90));
            CGContextTranslateCTM (bitmap, -width, 0);
    
        } else if (image.imageOrientation == UIImageOrientationUp) {
            NSLog(@"image orientation up");
    
        } else if (image.imageOrientation == UIImageOrientationDown) {
            NSLog(@"image orientation down");
            CGContextTranslateCTM (bitmap, width,height);
            CGContextRotateCTM (bitmap, radians(-180.));
    
        }
    
        CGContextDrawImage(bitmap, CGRectMake(0, 0, width, height), imageRef);
        CGImageRef ref = CGBitmapContextCreateImage(bitmap);
        UIImage *result = [UIImage imageWithCGImage:ref];
    
        CGContextRelease(bitmap);
        CGColorSpaceRelease(colorSpaceInfo);   // release the color space created above
        CGImageRelease(ref);
    
        return result;
    }
    

    For the radians() call to work, make sure you declare this helper above the @implementation:

    static inline double radians (double degrees) {return degrees * M_PI/180;}
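
    As a side note, here is a more compact sketch of the same 640 × 640 resize using plain UIKit; it is not the approach from the answer above, and the method name simpleResize: is only illustrative. drawInRect: applies the UIImage's orientation for you, so no manual rotation is needed:

    - (UIImage *)simpleResize:(UIImage *)image {
        CGSize target = CGSizeMake(640, 640);
        // Opaque bitmap context at 1x scale; drawInRect: honours the image's orientation.
        UIGraphicsBeginImageContextWithOptions(target, YES, 1.0);
        [image drawInRect:CGRectMake(0, 0, target.width, target.height)];
        UIImage *resized = UIGraphicsGetImageFromCurrentImageContext();
        UIGraphicsEndImageContext();
        return resized;
    }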
    

    Then I convert to grayscale.

    I found this article on the conversion: Convert image to grayscale.

    I have used the code from it successfully and can now read text in different colours on different coloured backgrounds.

    I have modified the code slightly so it works as a method within my class rather than as its own class, which is how the original author wrote it.

    - (UIImage *) toGrayscale:(UIImage*)img
    {
        // With kCGBitmapByteOrder32Little and kCGImageAlphaPremultipliedLast the bytes of each
        // pixel sit in memory as A, B, G, R, so the channel indices are:
        const int BLUE  = 1;
        const int GREEN = 2;
        const int RED   = 3;
    
        // Create image rectangle with current image width/height
        CGRect imageRect = CGRectMake(0, 0, img.size.width * img.scale, img.size.height * img.scale);
    
        int width = imageRect.size.width;
        int height = imageRect.size.height;
    
        // the pixels will be painted to this array
        uint32_t *pixels = (uint32_t *) malloc(width * height * sizeof(uint32_t));
    
        // clear the pixels so any transparency is preserved
        memset(pixels, 0, width * height * sizeof(uint32_t));
    
        CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();
    
        // create a context with RGBA pixels
        CGContextRef context = CGBitmapContextCreate(pixels, width, height, 8, width * sizeof(uint32_t), colorSpace,
                                                     kCGBitmapByteOrder32Little | kCGImageAlphaPremultipliedLast);
    
        // paint the bitmap to our context which will fill in the pixels array
        CGContextDrawImage(context, CGRectMake(0, 0, width, height), [img CGImage]);
    
        for(int y = 0; y < height; y++) {
            for(int x = 0; x < width; x++) {
                uint8_t *rgbaPixel = (uint8_t *) &pixels[y * width + x];
    
                // convert to grayscale using recommended method:     http://en.wikipedia.org/wiki/Grayscale#Converting_color_to_grayscale
                uint32_t gray = 0.3 * rgbaPixel[RED] + 0.59 * rgbaPixel[GREEN] + 0.11 * rgbaPixel[BLUE];
    
                // set the pixels to gray
                rgbaPixel[RED] = gray;
                rgbaPixel[GREEN] = gray;
                rgbaPixel[BLUE] = gray;
            }
        }
    
        // create a new CGImageRef from our context with the modified pixels
        CGImageRef image = CGBitmapContextCreateImage(context);
    
        // we're done with the context, color space, and pixels
        CGContextRelease(context);
        CGColorSpaceRelease(colorSpace);
        free(pixels);
    
        // make a new UIImage to return
        UIImage *resultUIImage = [UIImage imageWithCGImage:image
                                                 scale:img.scale
                                           orientation:UIImageOrientationUp];
    
        // we're done with image now too
        CGImageRelease(image);
    
        return resultUIImage;
    }
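
    Putting the pieces together, here is a rough sketch (not from the original answer) of chaining the two helpers and handing the prepared pixels to Tesseract's C++ TessBaseAPI from an Objective-C++ (.mm) file. The tessdata path, the "eng" language, the header include path, and the method name recognizeText:tessdataPath: are assumptions to adapt to your project:

    // At the top of the .mm file; the exact include path depends on how Tesseract was built.
    #include "baseapi.h"

    // Added to the same class as the helpers above (hypothetical method name).
    - (NSString *)recognizeText:(UIImage *)photo tessdataPath:(NSString *)tessdataPath
    {
        // 1. Prepare the photo: resize to 640 x 640, then convert to grayscale.
        UIImage *prepared = [self toGrayscale:[self resizeImage:photo]];

        // 2. Extract raw RGBA pixels, mirroring the bitmap setup in toGrayscale:.
        CGImageRef cgImage = [prepared CGImage];
        int width  = (int)CGImageGetWidth(cgImage);
        int height = (int)CGImageGetHeight(cgImage);
        uint8_t *pixels = (uint8_t *)calloc(width * height * 4, sizeof(uint8_t));
        CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();
        CGContextRef context = CGBitmapContextCreate(pixels, width, height, 8, width * 4,
                                                     colorSpace, kCGImageAlphaPremultipliedLast);
        CGContextDrawImage(context, CGRectMake(0, 0, width, height), cgImage);
        CGContextRelease(context);
        CGColorSpaceRelease(colorSpace);

        // 3. Run Tesseract over the pixel buffer (4 bytes per pixel, RGBA).
        tesseract::TessBaseAPI api;
        api.Init([tessdataPath UTF8String], "eng");
        api.SetImage(pixels, width, height, 4, width * 4);
        char *utf8Text = api.GetUTF8Text();
        NSString *result = utf8Text ? [NSString stringWithUTF8String:utf8Text] : @"";

        delete [] utf8Text;
        api.End();
        free(pixels);
        return result;
    }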
    
