Does anyone knows the meaning of output of image_to_data, image_to_osd methods of pytesseract?

后端 未结 2 1759
暖寄归人
暖寄归人 2021-01-22 10:39

I\'m trying to extract the data from image using pytesseract. This module has image_to_data, image_to_osd methods. These two m

2条回答
  •  误落风尘
    2021-01-22 11:25

    my_image.jpg

    For example, Test the my_image.jpg with image_to_data in the following code, we will get the results like the results.png.

    results.png

    • level = 1/2/3/4/5,the level of current item.

    • page_num: the page index of the current item. In most instances, a image only has one page.

    • block_num: the block item of the current item. when tesseract OCR Image, it will split the image into several blocks according the PSM parameters and some rules. The words in a line often in a block.

    • par_num: The paragraph index of the current item. It is the page analysis results. line_num: The line index of the current item. It is the page analysis results. word_num: The word index in one block.

    • line_num: The line index of the current item. It is the page analysis results.

    • word_num: The word index in one block.

    • left/top/width/height:the top-left coordinate and the width and height of the current word.

    • conf: the confidence of the current word, the range is -1~100.. The -1 means that there is no text here. The 100 is the highest value.

    • text: the word ocr results.

    The meaning of the results from image_to_osd:

    • Page number: the page index of the current item. In most instances, a image only has one page.

    • Orientation in degrees: the clockwise rotation angle of the text in the current image relative to its reading angle, the value range is [0, 270, 180, 90].

    • Rotate: Record the angle at which the text in the current image is to be converted into readable, relative to the clockwise rotation of the current image, the value range is [0, 270, 180, 90]. Complementary to the [Orientation in degrees] value.

    • Orientation confidence:Indicates the confidence of the current [Orientation in degrees] and [Rotate] detection values. The greater the confidence, the more credible the test result, but no explanation of its value range has been found so far.

    • Script: The encoding type of the text in the current picture.

    • Script confidence: The confidence of the text encoding type in the current image.

    from pytesseract import Output import pytesseract import cv2

    image = cv2.imread("my_image.jpg")
    
    #swap color channel ordering from BGR (OpenCV’s default) to RGB (compatible with Tesseract and pytesseract).
    # By default OpenCV stores images in BGR format and since pytesseract assumes RGB format,
    # we need to convert from BGR to RGB format/mode:
    rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
     
    pytesseract.pytesseract.tesseract_cmd = r'C:\mypath\tesseract.exe'
    custom_config = r'-c tessedit_char_whitelist=0123456789 --psm 6'
    results = pytesseract.image_to_data(rgb, output_type=Output.DICT,lang='eng',config=custom_config)
    print(results)
    

提交回复
热议问题