问题
My question is nearly identical to this question, except that the linked question deals with char*, whereas I'm using std::string in my code. Like the linked question, I'm also using C# as my target language.
I have a class written in C++:
class MyClass
{
public:
const std::string get_value() const; // returns utf8-string
void set_value(const std::string &value); // sets utf8-string
private:
// ...
};
And this get's wrapped by SWIG in C# as follows:
public class MyClass
{
public string get_value();
public void set_value(string value);
}
SWIG does everything for me, except that it doesn't make an utf8 to utf16 string conversion during the calls to MyClass. My strings come through fine if they are representable in ASCII, but if I try passing a string with non-ascii characters in a round-trip through "set_value" and "get_value", I end up with unintelligible characters.
How can I make SWIG wrap UTF-8 encoded C++ strings in C#? n.b. I'm using std::string, not std::wstring, and not char*.
There's a partial solution on the SWIG sourceforge site, but it deals with char* not std::string, and it uses a (configurable) fixed length buffer.
回答1:
With the help (read: genius!) of David Jeske in the linked Code Project article, I have finally been able to answer this question.
You'll need this class (from David Jeske's code) in your C# library.
public class UTF8Marshaler : ICustomMarshaler {
static UTF8Marshaler static_instance;
public IntPtr MarshalManagedToNative(object managedObj) {
if (managedObj == null)
return IntPtr.Zero;
if (!(managedObj is string))
throw new MarshalDirectiveException(
"UTF8Marshaler must be used on a string.");
// not null terminated
byte[] strbuf = Encoding.UTF8.GetBytes((string)managedObj);
IntPtr buffer = Marshal.AllocHGlobal(strbuf.Length + 1);
Marshal.Copy(strbuf, 0, buffer, strbuf.Length);
// write the terminating null
Marshal.WriteByte(buffer + strbuf.Length, 0);
return buffer;
}
public unsafe object MarshalNativeToManaged(IntPtr pNativeData) {
byte* walk = (byte*)pNativeData;
// find the end of the string
while (*walk != 0) {
walk++;
}
int length = (int)(walk - (byte*)pNativeData);
// should not be null terminated
byte[] strbuf = new byte[length];
// skip the trailing null
Marshal.Copy((IntPtr)pNativeData, strbuf, 0, length);
string data = Encoding.UTF8.GetString(strbuf);
return data;
}
public void CleanUpNativeData(IntPtr pNativeData) {
Marshal.FreeHGlobal(pNativeData);
}
public void CleanUpManagedData(object managedObj) {
}
public int GetNativeDataSize() {
return -1;
}
public static ICustomMarshaler GetInstance(string cookie) {
if (static_instance == null) {
return static_instance = new UTF8Marshaler();
}
return static_instance;
}
}
Then, in Swig's "std_string.i", on line 24 replace this line:
%typemap(imtype) string "string"
with this line:
%typemap(imtype, inattributes="[MarshalAs(UnmanagedType.CustomMarshaler, MarshalTypeRef = typeof(UTF8Marshaler))]", outattributes="[return: MarshalAs(UnmanagedType.CustomMarshaler, MarshalTypeRef = typeof(UTF8Marshaler))]") string "string"
and on line 61, replace this line:
%typemap(imtype) const string & "string"
with this line:
%typemap(imtype, inattributes="[MarshalAs(UnmanagedType.CustomMarshaler, MarshalTypeRef = typeof(UTF8Marshaler))]", outattributes="[return: MarshalAs(UnmanagedType.CustomMarshaler, MarshalTypeRef = typeof(UTF8Marshaler))]") string & "string"
Lo and behold, everything works. Read the linked article for a good understanding of how this works.
来源:https://stackoverflow.com/questions/19783363/how-to-wrap-utf-8-encoded-c-stdstrings-with-swig-in-c