Extracting text OpenCV

Asked by 臣服心动 on 2020-11-22 08:10

I am trying to find the bounding boxes of text in an image and am currently using this approach:

// calculate the local variances of the grayscale image
Mat          
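
A rough Python sketch of the local-variance idea (the box-filter formulation, the window size k and the file name are assumptions, since the snippet above is cut off):

import cv2
import numpy as np

# load the image as grayscale and convert to float so the variance does not overflow
gray = cv2.imread("card.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)

k = 15  # assumed neighbourhood size
local_mean = cv2.blur(gray, (k, k))                   # E[x] over each k x k window
local_mean_sq = cv2.blur(gray * gray, (k, k))         # E[x^2] over each window
local_var = local_mean_sq - local_mean * local_mean   # Var[x] = E[x^2] - E[x]^2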


        
10 Answers
  •  执念已碎
    2020-11-22 08:35

    Here is an alternative approach that I used to detect the text blocks:

    1. Converted the image to grayscale
    2. Applied threshold (simple binary threshold, with a handpicked value of 150 as the threshold value)
    3. Applied dilation to thicken lines in the image, leading to more compact objects and fewer white-space fragments. Used a high value for the number of iterations, so the dilation is very heavy (13 iterations, also handpicked for optimal results).
    4. Identified contours of objects in the resulting image using the OpenCV findContours function.
    5. Drew a bounding box (rectangle) circumscribing each contoured object - each of them frames a block of text.
    6. Optionally discarded areas that are unlikely to be the object you are searching for (e.g. text blocks) given their size, as the algorithm above can also find intersecting or nested objects (like the entire top area for the first card), some of which could be uninteresting for your purposes.

    Below is the code, written in Python with pyopencv; it should be easy to port to C++.

    import cv2
    
    image = cv2.imread("card.png")
    gray = cv2.cvtColor(image,cv2.COLOR_BGR2GRAY) # grayscale
    _,thresh = cv2.threshold(gray,150,255,cv2.THRESH_BINARY_INV) # threshold
    kernel = cv2.getStructuringElement(cv2.MORPH_CROSS,(3,3))
    dilated = cv2.dilate(thresh,kernel,iterations = 13) # dilate
    # findContours returns (image, contours, hierarchy) in OpenCV 3.x but (contours, hierarchy) in 2.x/4.x; taking the last two values works in either case
    contours, hierarchy = cv2.findContours(dilated,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_NONE)[-2:] # get contours
    
    # for each contour found, draw a rectangle around it on original image
    for contour in contours:
        # get rectangle bounding contour
        [x,y,w,h] = cv2.boundingRect(contour)
    
        # discard areas that are too large
        if h>300 and w>300:
            continue
    
        # discard areas that are too small
        if h<40 or w<40:
            continue
    
        # draw rectangle around contour on original image
        cv2.rectangle(image,(x,y),(x+w,y+h),(255,0,255),2)
    
    # write original image with added contours to disk  
    cv2.imwrite("contoured.jpg", image) 
    

    The original image is the first image in your post.

    After preprocessing (grayscale, threshold and dilate - so after step 3) the image looked like this:

    Dilated image

    Below is the resulting image ("contoured.jpg" written in the last line); the final bounding boxes for the objects in the image look like this:

    [resulting image with bounding boxes drawn around the detected text blocks]

    You can see the text block on the left is detected as a separate block, delimited from its surroundings.

    Using the same script with the same parameters (except for the thresholding type, which was changed for the second image as described below), here are the results for the other 2 cards:

    [bounding-box results for the other two cards]

    Tuning the parameters

    The parameters (threshold value, dilation parameters) were optimized for this image and this task (finding text blocks) and can be adjusted, if needed, for other card images or other types of objects to be found.
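
    As a sketch of how such tuning could be organized, the whole pipeline can be wrapped in a parameterized helper (the function name detect_text_boxes and its defaults are illustrative, not part of the original script):

    def detect_text_boxes(image, thresh_val=150, thresh_type=cv2.THRESH_BINARY_INV,
                          iterations=13, min_size=40, max_size=300):
        gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        _, thresh = cv2.threshold(gray, thresh_val, 255, thresh_type)
        kernel = cv2.getStructuringElement(cv2.MORPH_CROSS, (3, 3))
        dilated = cv2.dilate(thresh, kernel, iterations=iterations)
        # last two return values work across OpenCV versions
        contours, hierarchy = cv2.findContours(dilated, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)[-2:]
        boxes = []
        for contour in contours:
            x, y, w, h = cv2.boundingRect(contour)
            # same size filter as in the script above
            if (w > max_size and h > max_size) or w < min_size or h < min_size:
                continue
            boxes.append((x, y, w, h))
        return boxes

    The card images above would then use the defaults, and other images would only change the keyword arguments.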

    For thresholding (step 2), I used a black threshold. For images where the text is lighter than the background, such as the second image in your post, a white threshold should be used, so replace the thresholding type with cv2.THRESH_BINARY. For the second image I also used a slightly higher threshold value (180). Varying the threshold value and the number of dilation iterations will result in different degrees of sensitivity in delimiting objects in the image.
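
    Concretely, for the second image only the threshold line of the script changes, to something like this (values as described above):

    _,thresh = cv2.threshold(gray,180,255,cv2.THRESH_BINARY) # white threshold for light text on a dark background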

    Finding other object types:

    For example, decreasing the dilation to 5 iterations in the first image gives a finer delimitation of objects in the image, roughly finding all words in the image (rather than text blocks):

    [word-level detections after 5 dilation iterations]

    Knowing the rough size of a word, I discarded areas that were too small (less than 20 pixels in width or height) or too large (more than 100 pixels in width or height), ignoring objects that are unlikely to be words; that gives the results in the image above.
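
    For reference, the only changes to the script for this word-level result would be roughly the following (same pipeline, parameter values taken from the description above):

    dilated = cv2.dilate(thresh,kernel,iterations = 5) # lighter dilation -> word-sized blobs

    # inside the contour loop, keep only boxes roughly the size of a word
    if w<20 or h<20 or w>100 or h>100:
        continue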
