C# marshal unmanaged pointer return type

Question

I have an unmanaged library which has a function like this:

type* foo();

foo basically allocates an instance of the unmanaged type on the managed heap through Marshal.AllocHGlobal.

I have a managed version of type. It's not blittable but I have MarshalAs attributes set on members so I can use Marshal.PtrToStructure to get a managed version of it. But having to wrap calls to foo with extra bookkeeping to call Marshal.PtrToStructure is a bit annoying.

I'd like to be able to do something like this on the C# side:

[DllImport("mylib", CallingConvention = CallingConvention.Cdecl)]
[return: MarshalAs(UnmanagedType.LPStruct)]
type* foo();

and have C#'s marshaller handle the conversion behind the scenes, like it does for function arguments. I thought I should be able to do this because type is allocated on the managed heap. But maybe I can't? Is there any way to have C#'s inbuilt marshaller handle the unmanaged-to-managed transition on the return type for me without having to manually call Marshal.PtrToStructure?

Answer 1:

A custom marshaler works fine if, on the .NET side, type is declared as a class, not as a struct. This is clearly stated in the documentation of the UnmanagedType enumeration, under CustomMarshaler:

Specifies the custom marshaler class when used with the MarshalAsAttribute.MarshalType or MarshalAsAttribute.MarshalTypeRef field. The MarshalAsAttribute.MarshalCookie field can be used to pass additional information to the custom marshaler. You can use this member on any reference type.

Here is some sample code that should work fine:

[DllImport("mylib", CallingConvention = CallingConvention.Cdecl)]
[return: MarshalAs(UnmanagedType.CustomMarshaler, MarshalTypeRef = typeof(typeMarshaler))]
private static extern type Foo();

private class typeMarshaler : ICustomMarshaler
{
    public static readonly typeMarshaler Instance = new typeMarshaler();

    public static ICustomMarshaler GetInstance(string cookie) => Instance;

    public int GetNativeDataSize() => -1;

    public object MarshalNativeToManaged(IntPtr nativeData) => Marshal.PtrToStructure<type>(nativeData);

    // This sample assumes the native side allocates with GlobalAlloc (or LocalAlloc),
    // but you can use any allocator provided both sides use the same one.
    public void CleanUpNativeData(IntPtr nativeData) => Marshal.FreeHGlobal(nativeData);

    public IntPtr MarshalManagedToNative(object managedObj) => throw new NotImplementedException();
    public void CleanUpManagedData(object managedObj) => throw new NotImplementedException();
}

[StructLayout(LayoutKind.Sequential)]
class type
{
    /* declare fields */
};

Of course, changing unmanaged struct declarations into classes can have deep implications (that may not always raise compile-time errors), especially if you have a lot of existing code.

Another solution is to use Roslyn to parse your code, extract all Foo-like methods and generate one additional .NET method for each. I would do this.

Answer 2:

type* foo()

This is a very awkward function signature: it is hard to use correctly in a C or C++ program, and that never gets better when you pinvoke. Memory management is the biggest problem; you want to work with the programmer who wrote this code to make it better.

Your preferred signature should resemble int foo(type* arg, size_t size). In other words, the caller supplies the memory and the native function fills it in. The size argument is required to avoid memory corruption when the version of type changes and gets larger; it is often included as a field of type instead. The int return value is useful for returning an error code so you can fail gracefully. Beyond being safe, this is also much more efficient since no memory allocation is required at all. You can simply pass a local variable.
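On the C# side, that caller-allocated shape could look roughly like the sketch below. Everything named here is an assumption made up for illustration: the SizedType fields, the FillFunction delegate, and FakeFoo, which is a managed stand-in for the native function so the sample runs without mylib.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical managed mirror of the native type, with the size as a field.
[StructLayout(LayoutKind.Sequential)]
public struct SizedType
{
    public uint Size;   // caller sets this to the struct size before the call
    public int Value;
}

public static class CallerAllocated
{
    // Shape of the suggested native signature: int foo(type* arg, size_t size).
    // The real import might look like this (hypothetical):
    //   [DllImport("mylib", CallingConvention = CallingConvention.Cdecl)]
    //   private static extern int foo(ref SizedType arg, UIntPtr size);
    public delegate int FillFunction(ref SizedType arg, UIntPtr size);

    public static SizedType Call(FillFunction fill)
    {
        uint size = (uint)Marshal.SizeOf<SizedType>();
        var result = new SizedType { Size = size };   // a local, no native allocation
        int error = fill(ref result, (UIntPtr)size);
        if (error != 0)
            throw new InvalidOperationException($"foo failed with error code {error}.");
        return result;
    }

    // Managed stand-in for the native function.
    public static int FakeFoo(ref SizedType arg, UIntPtr size)
    {
        if (size.ToUInt64() < (ulong)Marshal.SizeOf<SizedType>())
            return 1;   // buffer too small for this version of the struct
        arg.Value = 7;
        return 0;
    }
}
```

Calling CallerAllocated.Call(CallerAllocated.FakeFoo) returns the filled struct; a non-zero return code from the fill function surfaces as an exception instead of a half-initialized value.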

... allocates an instance of the unmanaged type on the managed heap through Marshal.AllocHGlobal

No, this is where memory management assumptions get very dangerous. It is never the managed heap; native code has no decent way to call into the CLR. And you cannot assume that it used the equivalent of Marshal.AllocHGlobal(). Native code typically uses malloc() to allocate the storage, and which heap malloc() allocates from is an implementation detail of the CRT the library links. Only that CRT's free() function is guaranteed to release it reliably, and you cannot call that free() yourself. Skip to the bottom to see why AllocHGlobal() appeared to be correct.

There are function signatures that force the pinvoke marshaller to release the memory; it does so by calling Marshal.FreeCoTaskMem(). Note that this does not pair with Marshal.AllocHGlobal(), it uses a different heap. The marshaller assumes that the native code was written to support interop well and allocated with CoTaskMemAlloc(), which uses the heap that is dedicated to COM interop.
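A string return value is one such signature. The sketch below spells out, by hand, roughly what the marshaller does in that case; it is an approximation, not the runtime's actual code. The get_name import in the comment and the FakeGetName stand-in are hypothetical.

```csharp
using System;
using System.Runtime.InteropServices;

public static class CoTaskMemSketch
{
    // Hypothetical import. With a string return type the marshaller copies the
    // characters and then releases the returned pointer as CoTaskMem memory:
    //   [DllImport("mylib", CallingConvention = CallingConvention.Cdecl)]
    //   private static extern string get_name();

    // Stand-in for native code that cooperates by allocating from the CoTaskMem heap.
    public static IntPtr FakeGetName() => Marshal.StringToCoTaskMemAnsi("hello");

    // Copy the native string into a managed one, then free the native buffer
    // with the allocator that matches how it was allocated.
    public static string CopyAndFree(IntPtr native)
    {
        try
        {
            return Marshal.PtrToStringAnsi(native);
        }
        finally
        {
            Marshal.FreeCoTaskMem(native);
        }
    }
}
```

The point of the sketch is the pairing: the free only works because the allocation came from the same allocator, which is exactly what you cannot assume about a library that calls malloc().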

It's not blittable but I have MarshalAs attributes set...

That is the gritty detail that explains why you have to make it awkward. The pinvoke marshaller does not want to solve this problem, since it has to marshal a copy and there is too much risk in automatically releasing the storage for the object and its members. Using [MarshalAs] on the return value is unnecessary and does not make the code better; simply change the return type to IntPtr. That is ready to pass to Marshal.PtrToStructure() and to whatever memory release function you need.
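A minimal sketch of that IntPtr pattern, folded into one helper so the bookkeeping lives in a single place. The NativeType fields and the FakeFoo stand-in are assumptions for illustration; FakeFoo allocates with Marshal.AllocHGlobal only so the sample runs without mylib, and the release function you pass in must match whatever the real native code uses.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical managed mirror of the native type; the fields are assumptions.
[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Ansi)]
public struct NativeType
{
    public int Id;

    // An inline fixed-size character buffer makes the struct non-blittable.
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 16)]
    public string Name;
}

public static class FooInterop
{
    // Copies the native instance into a managed struct, then releases the
    // native memory with the caller-supplied function, even if the copy throws.
    public static T TakeOwnership<T>(IntPtr native, Action<IntPtr> release) where T : struct
    {
        if (native == IntPtr.Zero)
            throw new InvalidOperationException("Native call returned a null pointer.");
        try
        {
            return Marshal.PtrToStructure<T>(native);
        }
        finally
        {
            release(native);
        }
    }

    // Stand-in for the native foo(). The real declaration would be:
    //   [DllImport("mylib", CallingConvention = CallingConvention.Cdecl)]
    //   private static extern IntPtr foo();
    public static IntPtr FakeFoo()
    {
        var value = new NativeType { Id = 42, Name = "example" };
        IntPtr p = Marshal.AllocHGlobal(Marshal.SizeOf<NativeType>());
        Marshal.StructureToPtr(value, p, false);
        return p;
    }
}
```

Usage would be FooInterop.TakeOwnership<NativeType>(FooInterop.FakeFoo(), Marshal.FreeHGlobal). With a real library, the second argument would ideally be a release function exported by that same library.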

I have to talk about the reason that Marshal.AllocHGlobal() appeared to be correct. It did not use to be, but that has changed in recent Windows and VS versions. There was a big design change in Win8 and VS2012. The OS no longer creates separate heaps that Marshal.AllocHGlobal and Marshal.AllocCoTaskMem allocate from. It is now a single heap, the default process heap (GetProcessHeap() returns it). And there was a corresponding change in the CRT included with VS2012: it now also uses GetProcessHeap() instead of creating its own heap with HeapCreate().

This was a very big change and not publicized widely. Microsoft has not published any rationale for it that I know of. I assume that the basic reason was WinRT (aka UWP), which needed lots of memory management nastiness to get C++, C# and JavaScript code to work together seamlessly. It is quite convenient for everybody who has to write interop code; you can now assume that Marshal.FreeHGlobal() gets the job done. Or Marshal.FreeCoTaskMem(), like the pinvoke marshaller uses. Or free(), like the native code would use; there is no difference anymore.

But it is also a significant risk: you can no longer assume that the code is bug-free just because it works well on your dev machine, so you must re-test on Win7. There you get an AccessViolationException if you guessed wrong about the release function. It is worse if you also have to support XP or Win2003: no crash at all, but you'll silently leak memory. That is very hard to deal with when it happens, since you can't make progress without changing the native code. Best to get it right early.

Source: https://stackoverflow.com/questions/51618837/c-sharp-marshal-unmanaged-pointer-return-type

Tags

.net

interop

marshalling

unmanaged