Question
This is the image input.

I am using OpenCV in Python. I did some pre-processing and found the contours using
contours, hierarchy = cv2.findContours(thresh, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
Then I did the following to save each character:
import cv2
from PIL import Image

img1 = cv2.imread("test26.png")
nu = 1
fin = "final"
for cnt in contours:
    x, y, w, h = cv2.boundingRect(cnt)
    img2 = img1[y:y+h, x:x+w]  # crop the character's bounding box
    img3 = Image.fromarray(img2)
    filename = fin + str(nu) + ".png"
    nu = nu + 1
    img3.save(filename)
But the characters are saved in a tree-like order that I don't understand.
My intention is to extract the characters one by one, run OCR on them in order, and save the result as text.
Answer 1:
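You can find the location of each letter from the centroid of its contour. Note that cv2.moments takes a single contour, not the whole list:
for cnt in contours:
    M = cv2.moments(cnt)
    if M["m00"] == 0:
        continue  # zero-area contour, centroid is undefined
    cX = int(M["m10"] / M["m00"])
    cY = int(M["m01"] / M["m00"])
You can then order the characters by cX and cY (if there is only one line of text, cX alone is enough).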
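As a rough illustration of that ordering step, here is a minimal sketch in plain Python. It assumes the centroids have already been collected as (cX, cY) pairs, one per character. The helper name reading_order and the line_tol parameter are made up for this example and are not part of OpenCV. Centroids whose cY lies within line_tol pixels of the first centroid of the current line are treated as belonging to that line.

```python
def reading_order(centroids, line_tol=10):
    """Return indices of centroids sorted top-to-bottom, then left-to-right.

    centroids: list of (cX, cY) pairs, one per contour.
    line_tol:  hypothetical tolerance in pixels; a centroid joins the current
               line if its cY is within this distance of that line's first
               centroid.
    """
    # Visit centroids from top to bottom so lines are discovered in order.
    by_y = sorted(range(len(centroids)), key=lambda i: centroids[i][1])

    lines = []  # each entry is a list of indices belonging to one text line
    for i in by_y:
        cy = centroids[i][1]
        if lines and abs(cy - centroids[lines[-1][0]][1]) <= line_tol:
            lines[-1].append(i)
        else:
            lines.append([i])

    # Within each line, order the characters by their x coordinate.
    order = []
    for line in lines:
        order.extend(sorted(line, key=lambda i: centroids[i][0]))
    return order
```

The returned indices can be used to walk the contours list in reading order. A fixed tolerance is a simplification: it will likely misplace superscripts or stacked fractions, so treat it as a starting point only.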
Answer 2:
This code sorts the bounding boxes and should achieve what was probably intended:
import cv2

strFormula = "1!((x+1)*(x+2))"  # '!' stands in for a character that is not allowed in a file name
img = cv2.imread("test26.png")
imgGray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # cv2.imread returns BGR, not RGB
ret, imgThresh = cv2.threshold(imgGray, 127, 255, cv2.THRESH_BINARY)

# findContours returns (contours, hierarchy) in OpenCV 2.x and 4.x, but
# (image, contours, hierarchy) in 3.x; contours is always second from last.
contours = cv2.findContours(imgThresh, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)[-2]

lstBoundingBoxes = [cv2.boundingRect(cnt) for cnt in contours]
lstBoundingBoxes.sort()  # (x, y, w, h) tuples sort by x first, i.e. left to right

charNo = 0
for item in lstBoundingBoxes[1:]:  # skip first element ('bounding box' == entire image)
    charNo += 1
    fName = "charAtPosNo-" + str(charNo).zfill(2) + "_is_[ " + strFormula[charNo-1] + " ]" + ".png"
    x, y, w, h = item
    cv2.imwrite(fName, img[y:y+h, x:x+w])
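The [1:] slice relies on the whole-image box happening to sort first. A slightly more defensive variant is sketched below in plain Python, assuming the boxes are (x, y, w, h) tuples as returned by cv2.boundingRect. The function name order_char_boxes and the frame_ratio threshold are illustrative and not from the answer above.

```python
def order_char_boxes(boxes, img_w, img_h, frame_ratio=0.9):
    """Sort (x, y, w, h) boxes left to right, dropping image-sized boxes."""
    kept = [
        (x, y, w, h)
        for (x, y, w, h) in boxes
        # A box spanning most of the image in both directions is assumed
        # to be the outer frame rather than a character.
        if not (w >= frame_ratio * img_w and h >= frame_ratio * img_h)
    ]
    # Tuples compare element by element, so this orders by x, then y.
    return sorted(kept)


# Example with made-up boxes for a 100x30 image.
boxes = [(40, 5, 10, 20), (0, 0, 100, 30), (12, 6, 8, 18)]
ordered = order_char_boxes(boxes, img_w=100, img_h=30)
```

Filtering by size rather than by position in the list keeps the result correct even if no frame contour is detected at all, in which case [1:] would silently drop the first real character.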
Source: https://stackoverflow.com/questions/42992601/trying-to-segment-characters-and-save-it-in-order-to-image-files-but-contours-a