Passing utf-16 string to a Windows function

久未见 posted on 2020-01-15 09:20:28

Question


I have a Windows dll called some.dll with the following function:

void some_func(TCHAR* input_string)
{
...
}

some_func expects a pointer to a UTF-16 encoded string.

Running this Python code:

from ctypes import *

some_string = "disco duck"
param_to_some_func = c_wchar_p(some_string.encode('utf-16'))  #  here exception!

some_dll = WinDLL("some.dll")
some_dll.some_func(param_to_some_func)

fails with the exception "unicode string or integer address expected instead of bytes instance".

The documentation for ctypes and ctypes.wintypes is very thin, and I have not found a way to convert a python string to a Windows wide char and pass it to a function.


Answer 1:


According to [Python 3.Docs]: Built-in Types - Text Sequence Type - str (emphasis is mine):

Textual data in Python is handled with str objects, or strings. Strings are immutable sequences of Unicode code points.

On Windows, wchar_t is 2 bytes wide, so CTypes passes a str to C as a UTF-16 encoded wide string.
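The width of a wide character, and the bytes that actually reach the C side, can be checked from Python. The following is only a sketch using the standard library; the helper name wide_bytes is made up for illustration. It should report a width of 2 on Windows (UTF-16 code units) and usually 4 elsewhere (UTF-32).

```python
import ctypes
import sys

def wide_bytes(text):
    """Return the raw bytes of the wchar_t buffer built from `text` (hypothetical helper)."""
    buf = ctypes.create_unicode_buffer(text)  # wchar_t array, NUL-terminated
    return ctypes.string_at(ctypes.addressof(buf), ctypes.sizeof(buf))

# Width of wchar_t on this platform: 2 on Windows, usually 4 elsewhere.
WCHAR_SIZE = ctypes.sizeof(ctypes.c_wchar)

# Pick the codec that matches that width and the machine's byte order.
codec = ("utf-16" if WCHAR_SIZE == 2 else "utf-32") + (
    "-le" if sys.byteorder == "little" else "-be"
)

raw = wide_bytes("disco duck")
expected = "disco duck\0".encode(codec)  # explicit byte order, so no BOM
print(WCHAR_SIZE, raw == expected)
```

Note that the explicit "-le"/"-be" codecs do not prepend a byte order mark, which is what a C function reading wchar_t data expects.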

So, the correspondence between CTypes and Python types is as follows (it is also visible by comparing these two pages):

  • [Python 3.Docs]: ctypes - Fundamental data types
  • [Python 2.Docs]: ctypes - Fundamental data types
╔═══════════════╦══════════════╦══════════════╗
║    CTypes     ║   Python 3   ║   Python 2   ║
╠═══════════════╬══════════════╬══════════════╣
║   c_char_p    ║    bytes     ║     str      ║
║   c_wchar_p   ║     str      ║   unicode    ║
╚═══════════════╩══════════════╩══════════════╝


Example:

  • Python 3:

    >>> import sys
    >>> import ctypes as ct
    >>>
    >>> sys.version
    '3.7.6 (tags/v3.7.6:43364a7ae0, Dec 19 2019, 00:42:30) [MSC v.1916 64 bit (AMD64)]'
    >>>
    >>> text_ascii = b"Dummy"
    >>> text_unicode = "Dummy"
    >>>
    >>> ct.c_char_p(text_ascii)
    c_char_p(2563882450144)
    >>>
    >>> ct.c_wchar_p(text_ascii)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: unicode string or integer address expected instead of bytes instance
    >>>
    >>> ct.c_char_p(text_unicode)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: bytes or integer address expected instead of str instance
    >>>
    >>> ct.c_wchar_p(text_unicode)
    c_wchar_p(2563878400656)
    
  • Python 2 (note that str <=> unicode conversions are performed automatically):

    >>> import sys
    >>> import ctypes as ct
    >>>
    >>> sys.version
    '2.7.17 (v2.7.17:c2f86d86e6, Oct 19 2019, 21:01:17) [MSC v.1500 64 bit (AMD64)]'
    >>>
    >>> text_ascii = "Dummy"
    >>> text_unicode = u"Dummy"
    >>>
    >>> ct.c_char_p(text_ascii)
    c_char_p('Dummy')
    >>>
    >>> ct.c_wchar_p(text_ascii)
    c_wchar_p(u'Dummy')
    >>>
    >>> ct.c_char_p(text_unicode)
    c_char_p('Dummy')
    >>>
    >>> ct.c_wchar_p(text_unicode)
    c_wchar_p(u'Dummy')
    

Back to your situation:

>>> import ctypes as ct
>>>
>>> some_string = "disco duck"
>>>
>>> enc_utf16 = some_string.encode("utf16")
>>> enc_utf16
b'\xff\xfed\x00i\x00s\x00c\x00o\x00 \x00d\x00u\x00c\x00k\x00'
>>>
>>> type(some_string), type(enc_utf16)
(<class 'str'>, <class 'bytes'>)
>>>
>>> ct.c_wchar_p(some_string)  # This is the right way
c_wchar_p(2508534214928)
>>>
>>> ct.c_wchar_p(enc_utf16)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unicode string or integer address expected instead of bytes instance
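To show the same idea in an actual call, here is a minimal sketch that passes a str to a wide-character C function. Since some.dll is not available here, it uses wcslen from the C runtime as a stand-in (an assumption of this example, as is the helper name wide_length). For some_func, the same pattern would presumably apply, with the library loaded via WinDLL and argtypes set to [c_wchar_p].

```python
import ctypes
import sys

# Assumption: the C runtime exports wcslen. On Windows it is loaded from
# msvcrt; elsewhere CDLL(None) exposes the symbols of the running process.
if sys.platform == "win32":
    crt = ctypes.cdll.msvcrt
else:
    crt = ctypes.CDLL(None)

# Declaring the prototype lets ctypes convert a str argument automatically
# and reject a bytes argument before the call reaches C.
crt.wcslen.argtypes = [ctypes.c_wchar_p]
crt.wcslen.restype = ctypes.c_size_t

def wide_length(text):
    """Length in wchar_t units, as seen by the C side (hypothetical helper)."""
    return crt.wcslen(text)

print(wide_length("disco duck"))
```

With argtypes declared, the str can be passed directly; wrapping it in c_wchar_p by hand is not required.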

As a side note, TCHAR is a typedef whose definition depends on whether _UNICODE is defined: it is wchar_t when it is defined, and char otherwise. Check [MS.Docs]: Generic-Text Mappings in tchar.h for more details. So, depending on the C code compilation flags, the Python code might also need adjustments.
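One possible way to handle both build variants on the Python side is sketched below. The helper to_tchar_arg and its parameters are hypothetical, and the choice of encoding for the non-Unicode build is an assumption (the DLL may expect a specific code page).

```python
import ctypes
import locale

def to_tchar_arg(text, unicode_build=True, ansi_encoding=None):
    """Wrap `text` for a TCHAR* parameter (hypothetical helper).

    unicode_build=True  -> DLL compiled with _UNICODE, TCHAR is wchar_t
    unicode_build=False -> TCHAR is char, so the text must be encoded first
    """
    if unicode_build:
        return ctypes.c_wchar_p(text)
    # Assumption: the DLL expects the process's preferred (ANSI) encoding
    # unless the caller names another one.
    encoding = ansi_encoding or locale.getpreferredencoding(False)
    return ctypes.c_char_p(text.encode(encoding))

wide = to_tchar_arg("disco duck")
narrow = to_tchar_arg("disco duck", unicode_build=False, ansi_encoding="ascii")
print(type(wide).__name__, type(narrow).__name__)
```

Which branch is correct cannot be detected from Python; it has to match how the DLL was compiled.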



Source: https://stackoverflow.com/questions/59520331/passing-utf-16-string-to-a-windows-function
