Pass str as an int array to a Python C extended function (extended using SWIG)

梦谈多话 2020-12-12 05:49

How can I pass a str value (containing 3000 {'0', '1'} bytes) obtained using Python code as an argument to a Python C extended function (extended using SWIG)?

1 answer
  • 2020-12-12 05:55

    Regarding my comment, here are some more details about returning arrays from functions: [SO]: Returning an array using C. In short, there are 3 ways to handle this:

    1. Make the returned variable static
    2. Dynamically allocate it (using malloc (family) or new)
    3. Turn it into an additional argument for the function

    Getting that piece of C code to run within the Python interpreter is possible in 2 ways:

    • [Python 3.Docs]: Extending Python with C or C++ - which creates a C written Python module
      • A way of doing that is using swig which offers a simple interface for generating the module ([SWIG]: SWIG Basics) saving you the trouble of writing it yourself using [Python 3.Docs]: Python/C API Reference Manual
    • The other way around, leaving the code in a standard dll which can be accessed via [Python 3.Docs]: ctypes - A foreign function library for Python

    Since they both are doing the same thing, mixing them together makes no sense. So, pick the one that best fits your needs.


    1. ctypes

    • This is what you started with
    • It's one of the ways of doing things using ctypes

    ctypes_demo.c:

    #include <stdio.h>
    
    #if defined(_WIN32)
    #  define CTYPES_DEMO_EXPORT_API __declspec(dllexport)
    #else
    #  define CTYPES_DEMO_EXPORT_API
    #endif
    
    
    CTYPES_DEMO_EXPORT_API int exposekey(char *bitsIn, char *bitsOut) {
        int ret = 0;
        printf("Message from C code...\n");
        for (int j = 0; j < 1000; j++)
        {
            bitsOut[j] = bitsIn[j + 2000];
            ret++;
        }
        return ret;
    }
    

    Notes:

    • Based on comments, I changed the types in the function from int* to char*, because that is 4 times more compact (it is still wasteful, since only 1 bit of each char carries information and the other 7 are ignored; that can be fixed, but requires bitwise processing)
    • I took a and turned it into the 2nd argument (bitsOut). I think this is best because it makes the caller responsible for allocating and deallocating the array (the 3rd option from the beginning)
    • I also modified the index range (without changing functionality), because it makes more sense to work with low index values and add an offset to them in one place, instead of working with high index values and subtracting the same offset in another place
    • The return value is the number of elements copied (obviously, 1000 in this case), but it's just an example
    • printf is just a dummy call, to show that the C code gets executed
    • When dealing with such arrays, it's recommended to pass their dimensions as well, to avoid out of bounds errors. Also, error handling is an important aspect
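
    The bitwise processing mentioned in the first note could look roughly like the sketch below, shown in Python for brevity. pack_bits and unpack_bits are hypothetical helper names (not part of the original code), and the MSB-first bit order is an assumption that the C side would have to share.

```python
def pack_bits(bits: str) -> bytes:
    """Pack a string of '0'/'1' characters into bytes, 8 bits per byte (MSB first)."""
    if set(bits) - {"0", "1"}:
        raise ValueError("expected only '0' and '1' characters")
    # Pad with zeros on the right so the length is a multiple of 8.
    padded = bits.ljust((len(bits) + 7) // 8 * 8, "0")
    return bytes(int(padded[i:i + 8], 2) for i in range(0, len(padded), 8))


def unpack_bits(packed: bytes, count: int) -> str:
    """Inverse of pack_bits; count is the number of meaningful bits."""
    return "".join(format(byte, "08b") for byte in packed)[:count]


if __name__ == "__main__":
    bits = "010011000110101110101110101010010111011101101010101"
    packed = pack_bits(bits)
    print(len(bits), "characters packed into", len(packed), "bytes")
    print(unpack_bits(packed, len(bits)) == bits)
```

    Because the padding hides the original length, the number of meaningful bits has to travel alongside the packed data, which is one more reason to pass dimensions explicitly.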

    test_ctypes.py:

    from ctypes import CDLL, c_char, c_char_p, c_int, create_string_buffer
    
    
    bits_string = "010011000110101110101110101010010111011101101010101"
    
    
    def main():
        dll = CDLL("./ctypes_demo.dll")
        exposekey = dll.exposekey
    
        exposekey.argtypes = [c_char_p, c_char_p]
        exposekey.restype = c_int
    
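        # The explicit size (3000) zero-pads the buffer, so the C loop (which reads up to index 2999) stays in bounds
        bits_in = create_string_buffer(b"\0" * 2000 + bits_string.encode(), 3000)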
        bits_out = create_string_buffer(1000)
        print("Before: [{}]".format(bits_out.raw[:len(bits_string)].decode()))
        ret = exposekey(bits_in, bits_out)
        print("After: [{}]".format(bits_out.raw[:len(bits_string)].decode()))
        print("Return code: {}".format(ret))
    
    
    if __name__ == "__main__":
        main()
    

    Notes:

    • First, I want to mention that running your code didn't raise the error you got
    • Specifying the function's argtypes and restype is something you should always do: without them ctypes assumes an int return value and passes Python integers as C int, which can silently truncate pointers on 64-bit platforms. It also makes things easier (documented in the ctypes tutorial)
    • I am printing the bits_out array (only the first - and relevant - part, as the rest are 0) in order to prove that the C code did its job
    • I initialize the bits_in array with 2000 dummy 0 bytes at the beginning, as those values are not relevant here. Also, the input string (bits_string) is not 3000 characters long (for obvious reasons). If your bits_string is 3000 characters long you can simply initialize bits_in like: bits_in = create_string_buffer(bits_string.encode())
    • Do not forget to initialize bits_out to an array that is large enough for its purpose (1000 in our example), otherwise a segfault might occur when the C code writes past its end
    • For this (simple) function, the ctypes variant was easier (at least for me, since I don't use swig frequently), but for more complex functions / projects ctypes becomes cumbersome and switching to swig would be the right thing to do
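
    A minimal sketch of the buffer sizing described in the notes above, assuming the C function reads indexes 2000..2999 of its input and writes 1000 bytes of output. make_buffers and fake_exposekey are hypothetical names; fake_exposekey is only a Python stand-in that copies the same byte range with ctypes.memmove, so the snippet runs without the compiled library.

```python
from ctypes import addressof, create_string_buffer, memmove

OFFSET = 2000    # first input index the C loop reads
OUT_SIZE = 1000  # number of bytes the C loop copies
IN_SIZE = OFFSET + OUT_SIZE


def make_buffers(bits: bytes):
    """Return (bits_in, bits_out), both large enough for the C loop."""
    if len(bits) > OUT_SIZE:
        raise ValueError("at most {} bits are supported".format(OUT_SIZE))
    # The explicit size zero-pads the input up to the last index that is read.
    bits_in = create_string_buffer(b"\0" * OFFSET + bits, IN_SIZE)
    bits_out = create_string_buffer(OUT_SIZE)
    return bits_in, bits_out


def fake_exposekey(bits_in, bits_out) -> int:
    """Stand-in for dll.exposekey (assumption: same copy, without the printf)."""
    memmove(bits_out, addressof(bits_in) + OFFSET, OUT_SIZE)
    return OUT_SIZE


if __name__ == "__main__":
    bits_in, bits_out = make_buffers(b"0100110001")
    copied = fake_exposekey(bits_in, bits_out)
    print(len(bits_in), len(bits_out), copied, bits_out.value)
```

    With the real library, the same two buffers would be passed to dll.exposekey instead of the stand-in.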

    Output (running with Python3.5 on Win):

    c:\Work\Dev\StackOverflow\q47276327>"c:\Work\Dev\VEnvs\py35x64_test\Scripts\python.exe" test_ctypes.py
    Before: [                                                   ]
    Message from C code...
    After: [010011000110101110101110101010010111011101101010101]
    Return code: 1000
    


    2. swig

    • Almost everything from the ctypes section applies here as well

    swig_demo.c:

    #include <stdlib.h>
    #include <stdio.h>
    #include "swig_demo.h"
    
    
    char *exposekey(char *bitsIn) {
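        /* One extra byte for the terminating NUL that the char* to str conversion relies on. */
        char *bitsOut = (char*)malloc(sizeof(char) * 1001);
        if (bitsOut == NULL)
            return NULL;
        printf("Message from C code...\n");
        for (int j = 0; j < 1000; j++) {
            bitsOut[j] = bitsIn[j + 2000];
        }
        bitsOut[1000] = '\0';
        return bitsOut;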
    }
    

    swig_demo.i:

    %module swig_demo
    %{
    #include "swig_demo.h"
    %}
    
    %newobject exposekey;
    %include "swig_demo.h"
    

    swig_demo.h:

    char *exposekey(char *bitsIn);
    

    Notes:

    • Here I'm allocating the array and returning it (the 2nd option from the beginning)
    • The .i file is a standard swig interface file
      • Defines the module, and its exports (via %include)
      • One thing that is worth mentioning is the %newobject directive: it tells swig that exposekey returns newly allocated memory, so the generated wrapper frees the pointer after converting it to a Python string, avoiding a memory leak
    • The .h file just contains the function declaration, in order to be included by the .i file (it's not mandatory, but things are more elegant this way)
    • The rest is pretty much the same

    test_swig.py:

    from swig_demo import exposekey
    
    bits_in = "010011000110101110101110101010010111011101101010101"
    
    
    def main():
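        # Pad the input to 3000 characters, so the C loop (which reads up to index 2999) stays in bounds
        bits_out = exposekey("\0" * 2000 + bits_in.ljust(1000, "\0"))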
        print("C function returned: [{}]".format(bits_out))
    
    
    if __name__ == "__main__":
        main()
    

    Notes:

    • Things make much more sense from a Python programmer's point of view
    • Code is a lot shorter (that is because swig did some "magic" behind the scenes):
      • The .c wrapper file generated from the .i file has ~120K
      • The swig_demo.py generated module has ~3K
    • I used the same technique with 2000 dummy 0 characters at the beginning of the string

    Output:

    c:\Work\Dev\StackOverflow\q47276327>"c:\Work\Dev\VEnvs\py35x64_test\Scripts\python.exe" test_swig.py
    Message from C code...
    C function returned: [010011000110101110101110101010010111011101101010101]
    


    3. Plain Python C API

    • I added this part as a personal exercise
    • This is what swig does, but "manually"

    capi_demo.c:

    #include "Python.h"
    #include "swig_demo.h"
    
    #define MOD_NAME "capi_demo"
    
    
    static PyObject *PyExposekey(PyObject *self, PyObject *args) {
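        PyObject *bitsInArg = NULL, *bitsInBytes = NULL, *bitsOutArg = NULL;
        char *bitsOut = NULL;
        if (!PyArg_ParseTuple(args, "O", &bitsInArg))
            return NULL;
        /* Keep the encoded bytes object alive while the C code reads from it. */
        bitsInBytes = PyUnicode_AsEncodedString(bitsInArg, "ascii", "strict");
        if (bitsInBytes == NULL)
            return NULL;  /* Not a str, or not ASCII: the exception is already set. */
        bitsOut = exposekey(PyBytes_AS_STRING(bitsInBytes));
        Py_DECREF(bitsInBytes);  /* Release the temporary object instead of leaking it. */
        if (bitsOut == NULL)
            return PyErr_NoMemory();
        bitsOutArg = PyUnicode_FromString(bitsOut);
        free(bitsOut);
        return bitsOutArg;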
    }
    
    
    static PyMethodDef moduleMethods[] = {
        {"exposekey", (PyCFunction)PyExposekey, METH_VARARGS, NULL},
        {NULL}
    };
    
    
    static struct PyModuleDef moduleDef = {
        PyModuleDef_HEAD_INIT, MOD_NAME, NULL, -1, moduleMethods
    };
    
    
    PyMODINIT_FUNC PyInit_capi_demo(void) {
        return PyModule_Create(&moduleDef);
    }
    

    Notes:

    • It requires swig_demo.h and swig_demo.c (not going to duplicate their contents here)
    • It only works with Python 3 (actually, I had quite a few headaches making it work, especially because I was used to PyString_AsString, which is no longer present)
    • Error handling is poor
    • test_capi.py is similar to test_swig.py, with one (obvious) difference: from swig_demo import exposekey should be replaced by from capi_demo import exposekey
    • The output is also the same as for test_swig.py (again, not going to duplicate it here)
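
    Since all three variants are meant to return the same thing, one way to check them is against a pure-Python model of the C loop. This is only a sketch: build_input and exposekey_reference are hypothetical names, and the model assumes the input is padded to the full 3000 characters and that the result is read as a NUL-terminated char* (as in the swig and C API variants).

```python
def build_input(bits: str) -> str:
    """2000 dummy NUL characters, then the bits, NUL-padded to 3000 characters."""
    if len(bits) > 1000:
        raise ValueError("at most 1000 bits are supported")
    return "\0" * 2000 + bits.ljust(1000, "\0")


def exposekey_reference(bits_in: str) -> str:
    """Pure-Python model of exposekey as seen through a char* return value."""
    if len(bits_in) < 3000:
        raise ValueError("need at least 3000 characters, got {}".format(len(bits_in)))
    window = bits_in[2000:3000]
    # A returned char* is read up to the first NUL, so model that truncation too.
    return window.split("\0", 1)[0]


if __name__ == "__main__":
    bits = "010011000110101110101110101010010111011101101010101"
    print(exposekey_reference(build_input(bits)) == bits)
    # With a built extension module, the same input could be fed to the real
    # function and compared, for example:
    #   from swig_demo import exposekey
    #   assert exposekey(build_input(bits)) == exposekey_reference(build_input(bits))
```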