Increase space between text lines in image

*爱你&永不变心* posted on 2021-02-09 11:11:59

Question


I have an input image of a paragraph of single-spaced text. I'm trying to implement something like the line spacing option in Microsoft Word, which increases or decreases the space between text lines. The current image is single-spaced; how can I convert the text to double spacing, or, say, 0.5 spacing? Essentially I'm trying to dynamically restructure the spacing between text lines, preferably with an adjustable parameter. Something like this:

Input image

Desired result

My current attempt looks like this. I've been able to increase the spacing slightly but the text detail seems to be eroded and there is random noise in between lines.

Any ideas on how to improve the code or any better approaches?

import numpy as np 
import cv2

img = cv2.imread('text.png')
H, W = img.shape[:2]
grey = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
threshed = cv2.threshold(grey, 0, 255, cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]

hist = cv2.reduce(threshed, 1, cv2.REDUCE_AVG).reshape(-1)
spacing = 2
delimeter = [y for y in range(H - 1) if hist[y] <= spacing < hist[y + 1]]
arr = []
y_prev, y_curr = 0, 0
for y in delimeter:
    y_prev = y_curr
    y_curr = y
    arr.append(threshed[y_prev:y_curr, 0:W])

arr.append(threshed[y_curr:H, 0:W])
space_array = np.zeros((10, W))
result = np.zeros((1, W))

for im in arr:
    v = np.concatenate((space_array, im), axis=0)
    result = np.concatenate((result, v), axis=0)

result = (255 - result).astype(np.uint8)
cv2.imshow('result', result)
cv2.waitKey()

Answer 1:


Approach #1: Pixel analysis

  1. Obtain binary image. Load the image, convert to grayscale, and apply Otsu's threshold

  2. Sum row pixels. The idea is that the pixel sum of a row can be used to determine if it corresponds to text or white space

  3. Create new image and add additional white space. We iterate through the pixel array and add additional white space


Binary image

# Load image, grayscale, Otsu's threshold
image = cv2.imread('1.png')
h, w = image.shape[:2]
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

Now we sum the white pixels in each row of the thresholded image to generate a pixel array. Profiling this column of row sums tells us which rows correspond to text: sections of the data that equal 0 represent rows of the image that are composed entirely of white space. Here's a visualization of the data array:
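To make the "sections that equal 0" idea concrete, here is a minimal sketch that turns the row sums into explicit blank-row ranges. It runs on a small synthetic array standing in for `thresh`; the helper name `blank_runs` is hypothetical and not part of the original answer.

```python
import numpy as np

def blank_runs(row_sums):
    """Return (start, stop) index pairs, stop exclusive, for each run of all-zero rows."""
    is_blank = np.asarray(row_sums) == 0
    # Pad with False so runs touching the top or bottom edge still get both boundaries
    padded = np.concatenate(([False], is_blank, [False]))
    changes = np.flatnonzero(padded[1:] != padded[:-1])
    # Changes alternate between the start of a run and the row just past its end
    return list(zip(changes[::2].tolist(), changes[1::2].tolist()))

# Synthetic stand-in for `thresh`: two 3-row "text lines" on a black background
demo = np.zeros((10, 8), dtype=np.uint8)
demo[1:4, 2:6] = 255
demo[6:9, 1:7] = 255

row_sums = np.sum(demo, axis=1)
print(blank_runs(row_sums))  # [(0, 1), (4, 6), (9, 10)]
```

Each pair marks one gap between (or around) text lines, so the length of a pair is the current spacing in pixels at that position.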

# Sum white pixels in each row
# Create blank space array and final image
pixels = np.sum(thresh, axis=1).tolist()
space = np.ones((2, w), dtype=np.uint8) * 255
# Start from zero rows so no stray black row ends up at the top of the result
result = np.zeros((0, w), dtype=np.uint8)

We convert the data to a list and iterate through the data to build the final image. If a row is determined to be white space then we concatenate an empty space array to the final image. By adjusting the size of the empty array, we can change the amount of space to add to the image.

# Iterate through each row and add space if entire row is empty
# otherwise add original section of image to final image
for index, value in enumerate(pixels):
    if value == 0:
        result = np.concatenate((result, space), axis=0)
    row = gray[index:index+1, 0:w]
    result = np.concatenate((result, row), axis=0)

Here's the result

Code

import cv2
import numpy as np 
import matplotlib.pyplot as plt
# import pandas as pd

# Load image, grayscale, Otsu's threshold
image = cv2.imread('1.png')
h, w = image.shape[:2]
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

# Sum white pixels in each row
# Create blank space array and final image
pixels = np.sum(thresh, axis=1).tolist()
space = np.ones((1, w), dtype=np.uint8) * 255
result = np.zeros((0, w), dtype=np.uint8)

# Iterate through each row and add space if entire row is empty
# otherwise add original section of image to final image
for index, value in enumerate(pixels):
    if value == 0:
        result = np.concatenate((result, space), axis=0)
    row = gray[index:index+1, 0:w]
    result = np.concatenate((result, row), axis=0)

# Uncomment for plot visualization
'''
x = range(len(pixels))[::-1]
df = pd.DataFrame({'y': x, 'x': pixels})
df.plot(x='x', y='y', xlim=(-2000,max(pixels) + 2000), legend=None, color='teal')
'''
cv2.imshow('result', result)
cv2.imshow('thresh', thresh)
plt.show()
cv2.waitKey()
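Calling `np.concatenate` inside the loop copies the growing result on every iteration. As a sketch of the same idea with the amount of added space exposed as a parameter, the pieces can be collected in a list and joined once. The function name `add_row_spacing` and its `extra_rows` and `fill` parameters are illustrative assumptions, and the demo uses a synthetic image rather than `1.png`.

```python
import numpy as np

def add_row_spacing(gray, thresh, extra_rows=1, fill=255):
    """Insert `extra_rows` rows of `fill` before every blank row of `gray`.

    A row counts as blank when the matching row of `thresh` sums to zero,
    which is the same test the loop above uses.
    """
    blank = np.sum(thresh, axis=1) == 0
    spacer = np.full((extra_rows, gray.shape[1]), fill, dtype=gray.dtype)
    pieces = []
    for index in range(gray.shape[0]):
        if blank[index]:
            pieces.append(spacer)
        pieces.append(gray[index:index + 1])
    # Join everything in one call instead of re-copying the result per row
    return np.concatenate(pieces, axis=0)

# Synthetic 6x4 image: white background with a 2-row dark "text line"
gray_demo = np.full((6, 4), 255, dtype=np.uint8)
gray_demo[2:4, 1:3] = 0
thresh_demo = 255 - gray_demo  # text becomes white, background black

out = add_row_spacing(gray_demo, thresh_demo, extra_rows=2)
print(out.shape)  # (14, 4): 6 original rows + 2 extra rows for each of 4 blank rows
```

Because space is added per blank row, a gap that was `g` rows tall becomes `g * (1 + extra_rows)` rows tall, so `extra_rows=1` doubles every existing gap.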

Approach #2: Individual line extraction

For a more dynamic approach, we can find the contours of each line and then add space in between each contour. We use the same method of appending extra white space as the 1st approach.

  1. Obtain binary image. Load image, grayscale, Gaussian blur, and Otsu's threshold

  2. Connect text contours. We create a horizontal shaped kernel and dilate to connect the words of each line into a single contour

  3. Extract each line contour. We find contours, sort from top-to-bottom using imutils.contours.sort_contours() and extract each line ROI

  4. Append white space in between each line. We create an empty array and build the new image by appending white space between each line contour


Binary image

# Load image, grayscale, blur, Otsu's threshold
image = cv2.imread('1.png')
original = image.copy()
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (3,3), 0)
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
invert = 255 - thresh  
height, width = image.shape[:2]

Create horizontal kernel and dilate

# Dilate with a horizontal kernel to connect text contours
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (10,2))
dilate = cv2.dilate(thresh, kernel, iterations=2)
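To see why dilation with a wide kernel merges the words of a line, here is a 1-D analogue on a single row of pixels. It is only an illustration built on `np.convolve`, not a reimplementation of `cv2.dilate`; the helper `dilate_row` and its `radius` parameter are hypothetical names.

```python
import numpy as np

def dilate_row(row, radius):
    """Binary 1-D dilation: a pixel turns on if any pixel within `radius` is on."""
    kernel = np.ones(2 * radius + 1, dtype=int)
    # mode="same" keeps the output aligned with, and as long as, the input row
    return np.convolve(row.astype(int), kernel, mode="same") > 0

# Two "words" separated by a 3-pixel gap, then a 7-pixel gap before a third
row = np.array([1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1], dtype=np.uint8)
dilated = dilate_row(row, radius=2)
print(dilated.astype(int))
```

With `radius=2`, gaps of up to 4 pixels are bridged, so the 3-pixel gap closes while the 7-pixel gap only narrows. The same trade-off applies to the kernel width and iteration count above: they need to be large enough to span the gaps between words but not so large that separate lines merge vertically.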

Extracted individual line contour highlighted in green

# Extract each line contour
lines = []
cnts = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
(cnts, _) = contours.sort_contours(cnts, method="top-to-bottom")
for c in cnts:
    x,y,w,h = cv2.boundingRect(c)
    cv2.rectangle(image, (0, y), (width, y+h), (36,255,12), 2)
    line = original[y:y+h, 0:width]
    line = cv2.cvtColor(line, cv2.COLOR_BGR2GRAY)
    lines.append(line)

Append white space in between each line. Here's the result with a 1-pixel-high space array

Result with a 5-pixel-high space array

# Append white space in between each line
space = np.ones((1, width), dtype=np.uint8) * 255
result = np.zeros((0, width), dtype=np.uint8)
result = np.concatenate((result, space), axis=0)
for line in lines:
    result = np.concatenate((result, line), axis=0)
    result = np.concatenate((result, space), axis=0)
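The height of the space array is the adjustable parameter here. As a rough sketch, the stacking step can be wrapped in a function that takes the gap in pixels; `stack_lines` and its `gap` and `fill` parameters are hypothetical names, and the demo uses synthetic crops in place of the extracted line ROIs.

```python
import numpy as np

def stack_lines(lines, gap=5, fill=255):
    """Stack grayscale line crops vertically with `gap` rows of `fill` around each.

    Assumes `lines` is non-empty and every crop has the same width.
    """
    width = lines[0].shape[1]
    spacer = np.full((gap, width), fill, dtype=np.uint8)
    pieces = [spacer]
    for line in lines:
        pieces.extend([line, spacer])
    return np.vstack(pieces)

# Synthetic stand-ins for two extracted line crops of different heights
line_a = np.zeros((3, 6), dtype=np.uint8)
line_b = np.zeros((4, 6), dtype=np.uint8)

stacked = stack_lines([line_a, line_b], gap=5)
print(stacked.shape)  # (22, 6): 5 + 3 + 5 + 4 + 5 rows
```

Unlike the first approach, the gap here does not depend on how much space the input had between lines, since each crop is placed at a fixed distance from the next.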

Full code

import cv2
import numpy as np 
from imutils import contours

# Load image, grayscale, blur, Otsu's threshold
image = cv2.imread('1.png')
original = image.copy()
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (3,3), 0)
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
invert = 255 - thresh  
height, width = image.shape[:2]

# Dilate with a horizontal kernel to connect text contours
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (10,2))
dilate = cv2.dilate(thresh, kernel, iterations=2)

# Extract each line contour
lines = []
cnts = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
(cnts, _) = contours.sort_contours(cnts, method="top-to-bottom")
for c in cnts:
    x,y,w,h = cv2.boundingRect(c)
    cv2.rectangle(image, (0, y), (width, y+h), (36,255,12), 2)
    line = original[y:y+h, 0:width]
    line = cv2.cvtColor(line, cv2.COLOR_BGR2GRAY)
    lines.append(line)

# Append white space in between each line
space = np.ones((1, width), dtype=np.uint8) * 255
result = np.zeros((0, width), dtype=np.uint8)
result = np.concatenate((result, space), axis=0)
for line in lines:
    result = np.concatenate((result, line), axis=0)
    result = np.concatenate((result, space), axis=0)

cv2.imshow('result', result)
cv2.imshow('image', image)
cv2.imshow('dilate', dilate)
cv2.waitKey()


Source: https://stackoverflow.com/questions/59686943/increase-space-between-text-lines-in-image
