Segmentation fault while calling cpp function from Python

若如初见. 提交于 2019-12-10 17:47:58

问题


I am trying to call this cpp function from python:

TESS_API BOOL TESS_CALL TessBaseAPIProcessPages(TessBaseAPI* handle, const char* filename, 
  const char* retry_config, int timeout_millisec, TessResultRenderer* renderer)
{
    if (handle->ProcessPages(filename, retry_config, timeout_millisec, renderer))
        return TRUE;
    else
        return FALSE;
}

The last parameter of this function is TessResultRenderer. There is another cpp function for creating TessResultRenderer

TESS_API TessResultRenderer* TESS_CALL TessTextRendererCreate(const char* outputbase)
{
    return new TessTextRenderer(outputbase);
}

Now while calling this from my python, I did the following:

outputbase = "stdout"
renderer = tesseract.TessTextRendererCreate(outputbase)
text_out = tesseract.TessBaseAPIProcessPages(api, 
     ctypes.create_string_buffer(path), 
     None, 0, renderer) //Segmentation fault (core dumped) error on this line

but I keep getting Segmentation fault error.

My question is how can I called TessBaseAPIProcessPages from Python?

Some more reference links into the codebase:

referer api

Implementation of processPages(...)

Edit

After trying the commented suggestions, I did the following but I get an error: item 1 in _argtypes_ has no from_param method

PTessResultRenderer = ctypes.POINTER(TessResultRenderer)
self.tesseract.TessTextRendererCreate.restype = PTessResultRenderer
outputbase = "stdout"
self.tesseract.TessTextRendererCreate.argtypes = [outputbase] #error here
self.tesseract.TessTextRendererCreate

ReturnVal = ctypes.c_bool
self.tesseract.TessBaseAPIProcessPages.argtypes = [self.api, path, None, 0, PTessResultRenderer]
self.tesseract.TessBaseAPIProcessPages.restype = ReturnVal
self.tesseracto.TessBaseAPIProcessPages

class TessResultRenderer(ctypes.Structure):
    pass

回答1:


There is an example of using the tesseract C-API from ctypes in the contrib folder. However it seems to be a little out of date. contrib/tesseract-c_api-demo.py

You need to set the restype and argtypes for a few methods. Also, don't forget to call the init function on the handler. The following example works for me. It reads the text from a file called "test.bmp" in English into the text variable.

from ctypes import *
from ctypes.util import find_library

lang = b"eng"
filename = b"test.bmp"
TESSDATA_PREFIX = b"/usr/local/Cellar/tesseract/3.04.01_1/share/tessdata"

path = find_library("libtesseract.dylib")
tesseract = CDLL(path)

class TessBaseAPI(Structure):
    pass
class TessResultRenderer(Structure):
    pass

tesseract.TessBaseAPICreate.restype = POINTER(TessBaseAPI)
tesseract.TessBaseAPIInit3.argtypes = [POINTER(TessBaseAPI), c_char_p, c_char_p]
tesseract.TessBaseAPIInit3.restype = c_bool
tesseract.TessBaseAPIProcessPages.argtypes = [POINTER(TessBaseAPI), c_char_p, c_char_p, c_int, POINTER(TessResultRenderer)]
tesseract.TessBaseAPIProcessPages.restype = c_bool
tesseract.TessBaseAPIGetUTF8Text.argtypes = [POINTER(TessBaseAPI)]
tesseract.TessBaseAPIGetUTF8Text.restype = c_char_p

api = tesseract.TessBaseAPICreate()
rc = tesseract.TessBaseAPIInit3(api, TESSDATA_PREFIX, lang);
if (rc):
    tesseract.TessBaseAPIDelete(api)
    print("Could not initialize tesseract.\n")
    exit(3)

success = tesseract.TessBaseAPIProcessPages(api, filename, None , 0, None)

if success:
    text = tesseract.TessBaseAPIGetUTF8Text(api)
    print("="*78)
    print(text.decode("utf-8").strip())
    print("="*78)

The output looks like this:

==============================================================================
This is a lot of 12 point text to test the
ocr code and see if it works on all types
of file format.

The quick brown dog jumped over the
lazy fox. The quick brown dog jumped
over the lazy fox. The quick brown dog
jumped over the lazy fox. The quick
brown dog jumped over the lazy fox.
==============================================================================

Edit: Replaced use of c_void_p with opaque types as suggested by eryksun. Thanks!




回答2:


Segmentation faults occur when you run off of an array, or if you de-reference a null pointer. If you use a debugger, it will step you through all your code and show you exactly what is going on.



来源:https://stackoverflow.com/questions/36871072/segmentation-fault-while-calling-cpp-function-from-python

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!