How to read single-column text with the Google Cloud Vision API

Question

I have the following document image.

When I try to convert the image to text, I get the following result:

Top Text

Ref: Rad: Dte: Ddo:

Ejecutivo 76520400300 Banco de Bogotá Luz Adriana

Bottom Text

The problem is that the Google API recognizes it as two columns. How can I configure the Google API to read it as single-column text?

My goal is to obtain:

Top Text

Ref: Ejecutivo Rad: 76520400300 Dte: Banco de Bogotá Ddo: Luz Adriana

Bottom Text

Answer 1:

The Cloud Vision API doesn't have a request property that controls the layout or order in which the text is read. Instead, I think the available workaround is to use the BoundingPoly and Vertex response properties, which give the coordinates of each word detected in the image. Your own code can then process the vertex data to decide which words belong to the same row or column and group the text accordingly. You can take a look at this link, which includes some response examples that contain these properties.
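As a rough illustration of that workaround, here is a minimal Python sketch that rebuilds rows from word coordinates. It assumes you have already extracted each word's text and BoundingPoly vertices into plain tuples. The function name `group_words_into_lines`, the `y_tolerance` parameter, and the sample coordinates are all hypothetical and not part of the Vision API. It clusters words whose vertical centers are close together into one row, then sorts each row from left to right.

```python
from typing import List, Tuple

# Each word is (text, vertices), where vertices holds the (x, y) corner
# points copied from that word's BoundingPoly in the API response.
Word = Tuple[str, List[Tuple[int, int]]]


def group_words_into_lines(words: List[Word], y_tolerance: float = 0.6) -> List[str]:
    """Cluster words into rows by vertical position, then order each row
    from left to right. y_tolerance (a made-up tuning knob) is the allowed
    vertical offset, as a fraction of the tallest word in the row."""
    boxes = []
    for text, vertices in words:
        xs = [x for x, _ in vertices]
        ys = [y for _, y in vertices]
        boxes.append({
            "text": text,
            "left": min(xs),
            "center_y": (min(ys) + max(ys)) / 2,
            "height": max(ys) - min(ys),
        })

    # Walk the words from top to bottom so rows are built in reading order.
    boxes.sort(key=lambda b: b["center_y"])

    rows: List[list] = []
    for box in boxes:
        if rows:
            row = rows[-1]
            row_center = sum(b["center_y"] for b in row) / len(row)
            row_height = max(b["height"] for b in row)
            # Same row if the word's center is close to the row's mean center.
            if abs(box["center_y"] - row_center) <= y_tolerance * row_height:
                row.append(box)
                continue
        rows.append([box])

    return [
        " ".join(b["text"] for b in sorted(row, key=lambda b: b["left"]))
        for row in rows
    ]


if __name__ == "__main__":
    # Made-up coordinates imitating a label column and a value column.
    sample = [
        ("Ref:", [(10, 100), (50, 100), (50, 120), (10, 120)]),
        ("Rad:", [(10, 130), (50, 130), (50, 150), (10, 150)]),
        ("Ejecutivo", [(200, 101), (300, 101), (300, 121), (200, 121)]),
        ("76520400300", [(200, 131), (320, 131), (320, 151), (200, 151)]),
    ]
    for line in group_words_into_lines(sample):
        print(line)
```

With the Python client library, the per-word entries should be available as `response.text_annotations[1:]` (the first entry holds the full text), each exposing `description` and `bounding_poly.vertices`. The tolerance will likely need tuning for skewed or rotated scans, where a simple comparison of vertical centers may not be enough.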

If this approach doesn't cover your needs, you can use the Send Feedback button, located at the lower left and upper right corners of the service's public documentation, or open a Vision API feature request in the Issue Tracker to let Google know about this desired functionality.

Source: https://stackoverflow.com/questions/53949316/how-to-read-one-column-texts-with-google-cloud-vision-api

Tags

ocr

google-cloud-vision

text-recognition