Question
The following is an image of a page from old parish records. As you can see, the text is barely visible because the ink was diluted with a little too much water. Still, if you try hard enough, you can actually make out the letters. I would like to find a way to automatically fix such pages so that the text is more visible and readable.
I have tried some basic effects manually in IrfanView; the best result came from edge detection, but it was still far from readable. Now I am trying OpenCV in Python, and with a binary threshold I am getting some results:
import cv2

# Load as grayscale; pixels brighter than 240 become white, the rest black
img = cv2.imread('parish_page.png', cv2.IMREAD_GRAYSCALE)
img = cv2.threshold(img, 240, 255, cv2.THRESH_BINARY)[1]
cv2.imwrite('processed.png', img)
However, this seems to create a lot of noise, and it also destroyed the right border of the page. Is there a way to make the result cleaner, and perhaps even more readable?
I'll be glad for any tips, thanks in advance.
Answer 1:
In ImageMagick, you could use local area thresholding. (OpenCV has something similar called adaptive thresholding.)
convert img.png -negate -lat 20x20+2% -negate result.png
Lower or raise the 2% offset to get more or less gain, respectively.
Answer 2:
Here's a potential approach:
- Perform adaptive histogram equalization (CLAHE)
- Apply a sharpen filter using cv2.filter2D()
- Adaptive threshold
CLAHE
Now we apply a sharpening kernel using cv2.filter2D(). You could try other filters.
[ 0 -1 0]
[-1 5 -1]
[ 0 -1 0]
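Finally we threshold the image. The code below uses Otsu's method, which picks a single global threshold automatically; cv2.adaptiveThreshold() is the locally adaptive alternative.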
Other potential steps after this could be morphological transformations to remove noise and further filter the image, but since the particles are so tiny, even a (3x3) kernel removes too much detail.
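import cv2
import numpy as np

# Load the page as grayscale (flag 0 is cv2.IMREAD_GRAYSCALE)
image = cv2.imread('1.png', 0)

# Boost local contrast with CLAHE (default clip limit and tile grid)
clahe = cv2.createCLAHE().apply(image)

# Sharpen with an 8-neighbour kernel (stronger than the 4-neighbour one shown above)
sharpen_kernel = np.array([[-1,-1,-1], [-1,9,-1], [-1,-1,-1]])
sharpen = cv2.filter2D(clahe, -1, sharpen_kernel)

# Binarize with Otsu's automatically chosen global threshold
thresh = cv2.threshold(sharpen, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]

# Show and save each intermediate result
cv2.imshow('clahe', clahe)
cv2.imwrite('clahe.png', clahe)
cv2.imshow('sharpen', sharpen)
cv2.imwrite('sharpen.png', sharpen)
cv2.imshow('thresh', thresh)
cv2.imwrite('thresh.png', thresh)
cv2.waitKey()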
Source: https://stackoverflow.com/questions/57015156/improve-contrast-and-quality-of-barely-visible-old-text-written-with-diluted-ink