Question
I am trying to scan a passport page with a phone's camera, using OpenCV.
In the above image, the contour marked in red is my ROI (I need a top view of it). By performing segmentation I can detect the MRZ area, and the page should have a fixed aspect ratio. Is there a way to scale the green contour using the aspect ratio to approximate the red one? I have tried finding the corners of the green rect using approxPolyDP, then scaling that rect, and finally doing a perspective warp to get the top view. The problem is that the perspective rotation is not accounted for in the rectangular scaling, so the final rect is often wrong.
Often I get an output as marked in the following image
Update: Adding a little more explanation
In regard to the 1st image (assuming the red rect will always have a constant aspect ratio),
- My goal: crop out the red-marked portion and then get a top view of it.
- My approach: detect the MRZ (green rect) -> assume the bottom edge of the green rect is the same as that of the red one (close enough) -> this gives the width and two corners of the rect -> calculate the other two corners using the height/aspect ratio.
- Problem: the calculation above doesn't produce the red rect; instead it produces the green rect in the 2nd image (maybe because those quadrilaterals aren't rectangles: the angles between their edges aren't 0 or 90 degrees).
Answer 1:
As far as I understand, your main goal is to get the top view of the passport page when its photo is taken from an arbitrary angle. Also, as I understand it, your approach is the following:
- Find MRZ and its wrapping polygon
- Extend the MRZ polygon to the top - this would give you the page polygon
- Warp perspective to get the top view.
The main obstacle at the moment is extending the polygon.
Please correct me if I have understood the goal incorrectly.
Extending a polygon is quite easy from a mathematical perspective. The points on each side of the polygon define a line. If you continue that line further, you can place a new point on it. Programmatically it may look like this:
new_left_top_x = old_left_bottom_x + (old_left_top_x - old_left_bottom_x) * pass_height_to_MRZ_height_ratio
new_left_top_y = old_left_bottom_y + (old_left_top_y - old_left_bottom_y) * pass_height_to_MRZ_height_ratio
The same can be done for the right part. This approach would also work with rotations up to 45 degrees.
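The two formulas above can be wrapped into a small helper that handles both sides at once. This is only a minimal sketch: the function name `extend_quad_upward`, its argument order, and the example ratio of 5.0 are assumptions made for illustration, not values from the original post.

```python
import numpy as np


def extend_quad_upward(bottom_left, top_left, bottom_right, top_right, ratio):
    """Move the two top corners away from the bottom edge along the side lines.

    `ratio` plays the role of pass_height_to_MRZ_height_ratio; the value has
    to be measured for the document type (it is an assumption here).
    """
    bl = np.asarray(bottom_left, dtype=float)
    tl = np.asarray(top_left, dtype=float)
    br = np.asarray(bottom_right, dtype=float)
    tr = np.asarray(top_right, dtype=float)

    # new_top = bottom + (old_top - bottom) * ratio, applied to each side
    new_tl = bl + (tl - bl) * ratio
    new_tr = br + (tr - br) * ratio
    return new_tl, new_tr


# Example: an MRZ quad 20 px tall whose sides lean slightly inwards
new_tl, new_tr = extend_quad_upward(
    bottom_left=(0, 100), top_left=(2, 80),
    bottom_right=(200, 100), top_right=(198, 80),
    ratio=5.0,
)
```

Note that the extension is linear in image coordinates, so it does not model perspective foreshortening along the side lines.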
However, I'm afraid this approach would not give accurate results. I would suggest detecting the passport page itself instead of the MRZ. The reason is that the page is quite a noticeable object in the photo and can easily be found with the findContours function.
I wrote some code to illustrate the idea that detecting MRZ is not really necessary.
import os
import imutils
import numpy as np
import argparse
import cv2
# Thresholds
passport_page_aspect_ratio = 1.44
passport_page_coverage_ratio_threshold = 0.6
morph_size = (4, 4)

def pre_process_image(image):
    # Let's get rid of color first
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # Then apply Otsu threshold to reveal important areas
    _, thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
    # erode white areas to "disconnect" them
    # and dilate back to restore their original shape
    morph_struct = cv2.getStructuringElement(cv2.MORPH_RECT, morph_size)
    thresh = cv2.erode(thresh, morph_struct, anchor=(-1, -1), iterations=1)
    thresh = cv2.dilate(thresh, morph_struct, anchor=(-1, -1), iterations=1)
    return thresh

def find_passport_page_polygon(image):
    cnts = cv2.findContours(image, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    cnts = imutils.grab_contours(cnts)
    cnts = sorted(cnts, key=cv2.contourArea, reverse=True)
    for cnt in cnts:
        # compute the bounding box's aspect ratio and the fraction of the
        # image width that it covers
        (x, y, w, h) = cv2.boundingRect(cnt)
        ar = w / float(h)
        cr_width = w / float(image.shape[1])
        # check to see if the aspect ratio and coverage width are within thresholds
        if ar > passport_page_aspect_ratio and cr_width > passport_page_coverage_ratio_threshold:
            # approximate the contour with a polygon
            epsilon = 0.02 * cv2.arcLength(cnt, True)
            approx = cv2.approxPolyDP(cnt, epsilon, True)
            # order_points() expects exactly 4 corners, so skip anything else
            if len(approx) == 4:
                return approx
    return None

def order_points(pts):
    # initialize a list of coordinates that will be ordered in the order:
    # top-left, top-right, bottom-right, bottom-left
    rect = np.zeros((4, 2), dtype="float32")
    pts = pts.reshape(4, 2)
    # the top-left point will have the smallest sum, whereas
    # the bottom-right point will have the largest sum
    s = pts.sum(axis=1)
    rect[0] = pts[np.argmin(s)]
    rect[2] = pts[np.argmax(s)]
    # now, compute the difference between the points, the
    # top-right point will have the smallest difference,
    # whereas the bottom-left will have the largest difference
    diff = np.diff(pts, axis=1)
    rect[1] = pts[np.argmin(diff)]
    rect[3] = pts[np.argmax(diff)]
    return rect

def get_passport_top_vew(image, pts):
    rect = order_points(pts)
    (tl, tr, br, bl) = rect
    # compute the height of the new image, which will be the
    # larger of the two side lengths: the distance between the
    # top-right and bottom-right corners or the distance between
    # the top-left and bottom-left corners
    height_a = np.sqrt(((tr[0] - br[0]) ** 2) + ((tr[1] - br[1]) ** 2))
    height_b = np.sqrt(((tl[0] - bl[0]) ** 2) + ((tl[1] - bl[1]) ** 2))
    max_height = max(int(height_a), int(height_b))
    # compute the width using standard passport page aspect ratio
    max_width = int(max_height * passport_page_aspect_ratio)
    # construct the set of destination points to obtain the top view, specifying points
    # in the top-left, top-right, bottom-right, and bottom-left order
    dst = np.array([
        [0, 0],
        [max_width - 1, 0],
        [max_width - 1, max_height - 1],
        [0, max_height - 1]], dtype="float32")
    # compute the perspective transform matrix and apply it
    M = cv2.getPerspectiveTransform(rect, dst)
    warped = cv2.warpPerspective(image, M, (max_width, max_height))
    return warped

if __name__ == "__main__":
    ap = argparse.ArgumentParser()
    ap.add_argument("-i", "--image", required=True, help="path to the input image")
    args = vars(ap.parse_args())
    in_file = args["image"]
    # strip the extension to build the output file names
    filename_base = os.path.splitext(in_file)[0]
    img = cv2.imread(in_file)
    if img is None:
        raise SystemExit("Could not read image: " + in_file)
    pre_processed = pre_process_image(img)
    # Visualizing pre-processed image
    cv2.imwrite(filename_base + ".pre.png", pre_processed)
    page_polygon = find_passport_page_polygon(pre_processed)
    if page_polygon is not None:
        # Visualizing found page polygon
        vis = img.copy()
        cv2.polylines(vis, [page_polygon], True, (0, 255, 0), 2)
        cv2.imwrite(filename_base + ".bounds.png", vis)
        # Visualizing the warped top view of the passport page
        top_view_page = get_passport_top_vew(img, page_polygon)
        cv2.imwrite(filename_base + ".top.png", top_view_page)
The results I got:
For better results, it would also be good to compensate for the camera's lens distortion.
Source: https://stackoverflow.com/questions/41398356/how-to-scale-a-contours-height-by-a-factor