Question
I am trying to call this C++ function from Python:
TESS_API BOOL TESS_CALL TessBaseAPIProcessPages(TessBaseAPI* handle, const char* filename,
    const char* retry_config, int timeout_millisec, TessResultRenderer* renderer)
{
    if (handle->ProcessPages(filename, retry_config, timeout_millisec, renderer))
        return TRUE;
    else
        return FALSE;
}
The last parameter of this function is a TessResultRenderer. There is another C++ function for creating a TessResultRenderer:
TESS_API TessResultRenderer* TESS_CALL TessTextRendererCreate(const char* outputbase)
{
    return new TessTextRenderer(outputbase);
}
Now, when calling this from Python, I did the following:
outputbase = "stdout"
renderer = tesseract.TessTextRendererCreate(outputbase)
text_out = tesseract.TessBaseAPIProcessPages(api,
                                             ctypes.create_string_buffer(path),
                                             None, 0, renderer)  # Segmentation fault (core dumped) on this line
but I keep getting a Segmentation fault error.
My question is: how can I call TessBaseAPIProcessPages from Python?
Some more reference links into the codebase:
referer api
Implementation of processPages(...)
Edit
After trying the suggestions from the comments, I did the following, but now I get the error: item 1 in _argtypes_ has no from_param method
PTessResultRenderer = ctypes.POINTER(TessResultRenderer)
self.tesseract.TessTextRendererCreate.restype = PTessResultRenderer
outputbase = "stdout"
self.tesseract.TessTextRendererCreate.argtypes = [outputbase] #error here
self.tesseract.TessTextRendererCreate
ReturnVal = ctypes.c_bool
self.tesseract.TessBaseAPIProcessPages.argtypes = [self.api, path, None, 0, PTessResultRenderer]
self.tesseract.TessBaseAPIProcessPages.restype = ReturnVal
self.tesseracto.TessBaseAPIProcessPages
class TessResultRenderer(ctypes.Structure):
    pass
Answer 1:
There is an example of using the tesseract C API from ctypes in the contrib folder (contrib/tesseract-c_api-demo.py). However, it seems to be a little out of date.
You need to set the restype and argtypes for a few functions. Also, don't forget to call the init function on the handle. The following example works for me. It reads the text from a file called "test.bmp" in English into the text variable.
from ctypes import *
from ctypes.util import find_library

lang = b"eng"
filename = b"test.bmp"
TESSDATA_PREFIX = b"/usr/local/Cellar/tesseract/3.04.01_1/share/tessdata"

path = find_library("libtesseract.dylib")
tesseract = CDLL(path)

# Opaque types: Python only ever holds pointers to these, never their contents.
class TessBaseAPI(Structure):
    pass

class TessResultRenderer(Structure):
    pass

# Declare pointer-sized arguments and return values explicitly.
tesseract.TessBaseAPICreate.restype = POINTER(TessBaseAPI)
tesseract.TessBaseAPIDelete.argtypes = [POINTER(TessBaseAPI)]
tesseract.TessBaseAPIDelete.restype = None
tesseract.TessBaseAPIInit3.argtypes = [POINTER(TessBaseAPI), c_char_p, c_char_p]
tesseract.TessBaseAPIInit3.restype = c_int  # int in the C API: 0 on success, -1 on failure
tesseract.TessBaseAPIProcessPages.argtypes = [POINTER(TessBaseAPI), c_char_p, c_char_p, c_int, POINTER(TessResultRenderer)]
tesseract.TessBaseAPIProcessPages.restype = c_bool
tesseract.TessBaseAPIGetUTF8Text.argtypes = [POINTER(TessBaseAPI)]
tesseract.TessBaseAPIGetUTF8Text.restype = c_char_p

api = tesseract.TessBaseAPICreate()
rc = tesseract.TessBaseAPIInit3(api, TESSDATA_PREFIX, lang)
if rc != 0:
    tesseract.TessBaseAPIDelete(api)
    print("Could not initialize tesseract.\n")
    exit(3)

# Passing None as the renderer is a NULL pointer: no renderer is used.
success = tesseract.TessBaseAPIProcessPages(api, filename, None, 0, None)
if success:
    text = tesseract.TessBaseAPIGetUTF8Text(api)
    print("=" * 78)
    print(text.decode("utf-8").strip())
    print("=" * 78)
The output looks like this:
==============================================================================
This is a lot of 12 point text to test the
ocr code and see if it works on all types
of file format.
The quick brown dog jumped over the
lazy fox. The quick brown dog jumped
over the lazy fox. The quick brown dog
jumped over the lazy fox. The quick
brown dog jumped over the lazy fox.
==============================================================================
Edit: Replaced use of c_void_p with opaque types as suggested by eryksun. Thanks!
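To illustrate why these declarations matter, here is a small sketch that runs without tesseract. It assumes CPython. The address 0x7F1234567890 is made up, and PyBytes_Size from ctypes.pythonapi stands in for a C function only because it is always available. ctypes treats every return value as a C int unless restype says otherwise, so a 64-bit pointer returned by something like TessTextRendererCreate can lose its upper half before it is handed back to C, which would be consistent with the crash in the question. The second part shows that argtypes expects ctypes types rather than argument values, which is what the from_param error in the question is reporting.

```python
import ctypes

# 1) restype: without a declaration, ctypes converts the result to a C int.
#    Converting a (made-up) 64-bit address to c_int keeps only the low 32 bits.
fake_address = 0x7F1234567890  # hypothetical pointer value, for illustration only
truncated = ctypes.c_int(fake_address).value
print(hex(fake_address), "->", hex(truncated))

# 2) argtypes: the list holds ctypes *types*, never the values to be passed.
#    PyBytes_Size is a stand-in C function that exists in every CPython.
size_of = ctypes.pythonapi.PyBytes_Size
size_of.restype = ctypes.c_ssize_t

argtypes_error = None
try:
    size_of.argtypes = [b"stdout"]  # a value, so ctypes rejects it
except TypeError as exc:
    argtypes_error = str(exc)
    print("TypeError:", exc)

size_of.argtypes = [ctypes.py_object]  # a type, so ctypes accepts it
length = size_of(b"stdout")            # the value is passed at call time
print(length)
```

Applied to the question, this suggests setting TessTextRendererCreate.argtypes to [c_char_p] and its restype to POINTER(TessResultRenderer), then passing b"stdout" as the argument when calling it.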
Answer 2:
Segmentation faults typically occur when code runs off the end of an array or dereferences a null or otherwise invalid pointer. If you use a debugger, you can step through the code and see exactly where that happens.
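Short of attaching a native debugger, Python's standard faulthandler module can at least show which Python line was executing when the process crashed. The sketch below is not tesseract-specific: it starts a child interpreter that crashes on purpose by reading from address 0 with ctypes.string_at, and prints what faulthandler wrote to stderr. The deliberate crash is only a stand-in for a bad ctypes call.

```python
import subprocess
import sys

# Child program: enable faulthandler, then crash by reading a NULL pointer.
child_code = (
    "import ctypes, faulthandler\n"
    "faulthandler.enable()\n"
    "ctypes.string_at(0)\n"
)

# Run the crash in a separate process so this script itself survives.
result = subprocess.run(
    [sys.executable, "-c", child_code],
    capture_output=True,
    text=True,
)

print("exit status:", result.returncode)
print(result.stderr)  # the dump names the Python frames active at the crash
```

In a real script it is enough to call faulthandler.enable() once at the top, or to start the interpreter with python -X faulthandler.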
Source: https://stackoverflow.com/questions/36871072/segmentation-fault-while-calling-cpp-function-from-python