Detect space between text (OpenCV, Python)

Posted by 柔情痞子 on 2019-12-18 09:34:52

Question


I have the following code (which is in fact just one of the four parts needed to run the whole project I am working on):

#python classify.py --model models/svm.cpickle --image images/image.png

from __future__ import print_function
import joblib  # sklearn.externals.joblib was removed in scikit-learn 0.23
from hog import HOG
import dataset
import argparse
import mahotas
import cv2

ap = argparse.ArgumentParser()
ap.add_argument("-m", "--model", required = True,
    help = "path to where the model will be stored")
ap.add_argument("-i", "--image", required = True,
    help = "path to the image file")
args = vars(ap.parse_args())

model = joblib.load(args["model"])

hog = HOG(orientations = 18, pixelsPerCell = (10, 10),
    cellsPerBlock = (1, 1), transform = True)

image = cv2.imread(args["image"])
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

blurred = cv2.GaussianBlur(gray, (5, 5), 0)
edged = cv2.Canny(blurred, 30, 150)
# [-2] picks the contour list whether findContours returns 2 values (OpenCV 4) or 3 (OpenCV 3)
cnts = cv2.findContours(edged.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]

cnts = sorted([(c, cv2.boundingRect(c)[0]) for c in cnts], key =
    lambda x: x[1])

for (c, _) in cnts:
    (x, y, w, h) = cv2.boundingRect(c)

    if w >= 7 and h >= 20:
        roi = gray[y:y + h, x:x + w]
        thresh = roi.copy()
        T = mahotas.thresholding.otsu(roi)
        thresh[thresh > T] = 255
        thresh = cv2.bitwise_not(thresh)

        thresh = dataset.deskew(thresh, 20)
        thresh = dataset.center_extent(thresh, (20, 20))

        cv2.imshow("thresh", thresh)

        hist = hog.describe(thresh)
        digit = model.predict([hist])[0]
        print("I think that number is: {}".format(digit))

        cv2.rectangle(image, (x, y), (x + w, y + h),
        (0, 255, 0), 1)
        cv2.putText(image, str(digit), (x - 10, y - 10),
        cv2.FONT_HERSHEY_SIMPLEX, 1.2, (0, 255, 0), 2)
        cv2.imshow("image", image)
        cv2.waitKey(0)

This code detects and recognizes handwritten digits in images. Here is an example:

Let's say I don't care about the recognition accuracy.

My problem is the following: as you can see, the program takes all the numbers it can find and prints them to the console. From the console I can save them to a text file if I want, BUT I can't tell the program that there is a space between the numbers.

What I want is that, if I print the numbers to a text file, they are separated as in the image (sorry, it's a bit hard to explain). The numbers should not be printed all together, even in the console; wherever there is blank space in the image, a blank should be printed as well.

Take a look at the first image. After the first 10 digits there is a blank space in the image that does not appear in the console output.

Anyway, here is a link to the full code. There are 4 .py files and 3 folders. To execute it, open a CMD in the folder and paste the command python classify.py --model models/svm.cpickle --image images/image.png, where image.png is the name of one file in the images folder.

Full Code

Thanks in advance. In my opinion all this work would have to be done using neural networks but I want to try it first this way. I'm pretty new to this.


Answer 1:


This is a starter solution.

I don't have anything in Python at the moment, but it shouldn't be hard to convert this C++ code: the OpenCV function calls are similar, and I've linked them below.


TL;DR

Find the centres of your boundingRects, then find the distance between neighbouring centres. If the next rect is more than a certain threshold away, you may assume there is a space between the two.
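
The threshold does not have to be a hard-coded pixel value. As a rough sketch, it can be scaled from the typical spacing between neighbouring digits. The function name estimate_gap_threshold, the factor of 1.5 and the sample x-coordinates are assumptions made for illustration, not part of the original answer, and the heuristic only makes sense when most neighbours are not separated by a space.

```python
from statistics import median

def estimate_gap_threshold(xs, factor=1.5):
    """Guess a space threshold as factor * median spacing between
    consecutive x-centres (xs must already be sorted left to right)."""
    gaps = [b - a for a, b in zip(xs, xs[1:])]
    if not gaps:
        return 0.0  # fewer than two centres: nothing to compare
    return factor * median(gaps)

# Hypothetical x-centres: three digits, a wide gap, then two more digits.
xs = [20, 50, 80, 170, 200]
threshold = estimate_gap_threshold(xs)
print(threshold)  # prints 45.0
```

With these sample values the usual spacing is 30 pixels, so the 90-pixel jump between 80 and 170 is the only one that exceeds the estimated threshold.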


First, find the centres of your bounding rectangles

vector<Point2f> centres;

for(size_t index = 0; index < contours.size(); ++index)
{
    Moments moment = moments(contours[index]);

    centres.push_back(Point2f(static_cast<float>(moment.m10/moment.m00), static_cast<float>(moment.m01/moment.m00)));
}
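
In Python, a simpler approximation (a sketch, not a line-by-line port of the moments code) is to take the centre of each bounding box. It assumes the boxes are (x, y, w, h) tuples in the layout returned by cv2.boundingRect; the helper name box_centre and the sample boxes are made up for illustration.

```python
def box_centre(box):
    """Return the centre (cx, cy) of an (x, y, w, h) bounding box."""
    x, y, w, h = box
    return (x + w / 2.0, y + h / 2.0)

# Hypothetical boxes in the (x, y, w, h) layout used by cv2.boundingRect.
boxes = [(10, 5, 20, 30), (40, 5, 20, 30), (100, 5, 20, 30)]
centres = [box_centre(b) for b in boxes]
print(centres)  # prints [(20.0, 20.0), (50.0, 20.0), (110.0, 20.0)]
```

The bounding-box centre and the moment-based centroid usually differ a little for handwritten digits, but either should be good enough for measuring horizontal gaps.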

(Optional but recommended)

You can draw the centres to have a visual understanding of them.

for(size_t index = 0; index < centres.size(); ++index)
{
    Scalar colour = Scalar(255, 255, 0);
    // draw a small filled-looking dot at each centre
    circle(frame, centres[index], 2, colour, 2);
}

With this, iterate through them and check whether the distance to the next one exceeds a reasonable threshold:

// assumes centres are ordered left to right;
// stop one short of the end so that centres[index + 1] is always valid
for(size_t index = 0; index + 1 < centres.size(); ++index)
{
    // this is just a sample value (in pixels). Tweak it around to see which value actually makes sense
    double distance = 0.5;
    Point2f current = centres[index];
    Point2f nextPoint = centres[index + 1];

    // norm calculates the euclidean distance between two points
    if(norm(nextPoint - current) >= distance)
    {
        // TODO: This is a potential space??
    }
}
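
For the Python program in the question, the same loop could look roughly like the sketch below. It assumes the recognized digits and their centres are collected in two lists in the same left-to-right order (the question's code already sorts contours by x). The function name join_with_spaces, the max_gap value and the sample data are hypothetical.

```python
import math

def join_with_spaces(digits, centres, max_gap):
    """Concatenate digits left to right, inserting a space wherever two
    neighbouring centres are more than max_gap pixels apart."""
    if not digits:
        return ""
    pieces = [str(digits[0])]
    for i in range(1, len(digits)):
        (x0, y0), (x1, y1) = centres[i - 1], centres[i]
        # Euclidean distance between neighbouring centres
        if math.hypot(x1 - x0, y1 - y0) > max_gap:
            pieces.append(" ")
        pieces.append(str(digits[i]))
    return "".join(pieces)

# Hypothetical data: digits 2 and 3 are 60 px apart, the others 30 px.
digits = [1, 2, 3, 4]
centres = [(20.0, 20.0), (50.0, 20.0), (110.0, 20.0), (140.0, 20.0)]
line = join_with_spaces(digits, centres, max_gap=45)
print(line)  # prints "12 34"
```

The resulting string can be written to a text file as is, so the spacing from the image carries over to the output.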

You can read more about moments, norm and circle drawing calls in Python.

Happy coding, Cheers mate :)




Answer 2:


I used this code to do the job. It detects regions of text/digits in images.

import cv2

image = cv2.imread("C:\\Users\\Bob\\Desktop\\PyHw\\images\\test5.png")
gray = cv2.cvtColor(image,cv2.COLOR_BGR2GRAY) # grayscale
_,thresh = cv2.threshold(gray,150,255,cv2.THRESH_BINARY_INV) # threshold
kernel = cv2.getStructuringElement(cv2.MORPH_CROSS,(3,3))
dilated = cv2.dilate(thresh,kernel,iterations = 13) # dilate
# [-2] picks the contour list whether findContours returns 2 values (OpenCV 4) or 3 (OpenCV 3)
contours = cv2.findContours(dilated,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_NONE)[-2] # get contours


idx =0
# for each contour found, draw a rectangle around it on original image
for contour in contours:

    idx += 1

    # get rectangle bounding contour
    [x,y,w,h] = cv2.boundingRect(contour)

    # discard areas that are too large
    if h>300 and w>300:
        continue

    # discard areas that are too small
    if h<40 or w<40:
        continue

    # draw rectangle around contour on original image
    #cv2.rectangle(image,(x,y),(x+w,y+h),(255,0,255),2)

    roi = image[y:y + h, x:x + w]

    cv2.imwrite('C:\\Users\\Bob\\Desktop\\' + str(idx) + '.jpg', roi)

    cv2.imshow('img',roi)
    cv2.waitKey(0)

The code is based on this other question/answer: Extracting text OpenCV
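
One possible way to use such regions for the spacing problem, sketched under assumptions: treat each dilated region as one group of digits, assign every digit box to the region that contains its centre, and print a space whenever the region changes. All boxes are (x, y, w, h) tuples as returned by cv2.boundingRect; the function name group_by_region and the sample boxes are hypothetical and not taken from the answer's code.

```python
def group_by_region(digit_boxes, region_boxes):
    """Map each digit box to the index of the region containing its centre.
    Boxes are (x, y, w, h); digits outside every region map to None."""
    def contains(region, point):
        rx, ry, rw, rh = region
        px, py = point
        return rx <= px <= rx + rw and ry <= py <= ry + rh

    labels = []
    for (x, y, w, h) in digit_boxes:
        centre = (x + w / 2.0, y + h / 2.0)
        labels.append(next((i for i, r in enumerate(region_boxes)
                            if contains(r, centre)), None))
    return labels

# Hypothetical regions (from the dilated image) and digit boxes.
regions = [(0, 0, 70, 40), (90, 0, 70, 40)]
digit_boxes = [(10, 5, 20, 30), (40, 5, 20, 30), (100, 5, 20, 30)]
labels = group_by_region(digit_boxes, regions)
print(labels)  # prints [0, 0, 1]
```

Here the first two digits fall into region 0 and the third into region 1, so a space would go between the second and third digit. How well this works depends on the dilation settings merging neighbouring digits without bridging the real gaps.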



Source: https://stackoverflow.com/questions/46001090/detect-space-between-text-opencv-python
