Using Unicode strings in DllImport with a DLL written in Rust

扶醉桌前 提交于 2019-12-06 04:37:54
DK.

If you check the documentation on CharSet, you'll see that CharSet.Unicode tells .NET to marshal strings as UTF-16 (i.e. two bytes per code point). Thus, .NET is trying to pass printc what should be a *const u16, not a *const libc::c_char. When CStr goes to compute the length of the string, what it sees is the following:

b"y\0e\0y\0e\0y\0e\0"

That is, it sees one code unit, then a null byte, so it stops; hence why it says the length is "1".

Rust has no standard support for UTF-16 strings, but if you're working on Windows, there are some conversion methods: search the docs for OsStrExt and OsStringExt. Note that you must use the docs that installed with the compiler; the ones online won't include it.

Sadly, there's nothing for dealing directly with null-terminated UTF-16 strings. You'll need to write some unsafe code to turn a *const u16 into a &[u16] that you can pass to OsStringExt::from_wide.

Now, Rust does use Unicode, but it uses UTF-8. Sadly, there is no direct way to get .NET to marshal a string as UTF-8. Using any other encoding would appear to lose information, so you either have to explicitly deal with UTF-16 on the Rust side, or explicitly deal with UTF-8 on the C# side.

It's much simpler to re-encode the string as UTF-8 in C#. You can exploit the fact that .NET will marshal an array as a raw pointer to the first element (just like C) and pass a null-terminated UTF-8 string.

First, a static method for taking a .NET string and producing a UTF-8 string stored in a byte array:

byte[] NullTerminatedUTF8bytes(string str)
{
    return Encoding.GetBytes(str + "\0");
}

Then declare the signature of the Rust function like this:

[DllImport(dllname, CallingConvention = CallingConvention.Cdecl)]
static extern void printc([In] byte[] str);

Finally, call it like this:

printc(NullTerminatedUTF8bytes(str));

For bonus points, you can rework printc to instead take a *const u8 and a u32, passing the re-encoded string plus it's length; then you don't need the null terminator and can reconstruct the string using the std::slice::from_raw_parts function (but that's starting to go beyond the original question).

As for print2, that one is just unworkable. .NET knows nothing about Rust's String type, and it is in no way compatible with .NET strings. More than that, String doesn't even have a guaranteed layout, so binding to it safely is more or less not possible.

All that is a very long-winded way of saying: don't use String, or any other non-FFI-safe type, in cross-language functions, ever. If your intention here was to pass an "owned" string into Rust... I don't know if it's even possible to do in concert with .NET.

Aside: "FFI-safe" in Rust essentially boils down to: is either a built-in fixed-size type (i.e. not usize/isize), or is a user-defined type with #[repr(C)] attached to it. Sadly, the "FFI-safe"-ness of a type isn't included in the documentation.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!