Marshalling a big-endian byte collection into a struct in order to pull out values

泄露秘密 提交于 2019-11-27 04:40:46
Onur

Here's another solution for swapping endianness.

It's adjusted from Adam Robinsons solution here: https://stackoverflow.com/a/2624377/1254743

It's even capable of handling nested structs.

public static class FooTest
{
    [StructLayout(LayoutKind.Sequential, Pack = 1)]
    public struct Foo2
    {
        public byte b1;
        public short s;
        public ushort S;
        public int i;
        public uint I;
        public long l;
        public ulong L;
        public float f;
        public double d;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 10)]
        public string MyString;
    }

    [StructLayout(LayoutKind.Sequential, Pack = 1)]
    public struct Foo
    {
        public byte b1;
        public short s;
        public ushort S;
        public int i;
        public uint I;
        public long l;
        public ulong L;
        public float f;
        public double d;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 10)]
        public string MyString;
        public Foo2 foo2;
    }

    public static void test()
    {
        Foo2 sample2 = new Foo2()
        {
            b1 = 0x01,
            s = 0x0203,
            S = 0x0405,
            i = 0x06070809,
            I = 0x0a0b0c0d,
            l = 0xe0f101112131415,
            L = 0x161718191a1b1c,
            f = 1.234f,
            d = 4.56789,
            MyString = @"123456789", // null terminated => only 9 characters!
        };

        Foo sample = new Foo()
        {
            b1 = 0x01,
            s = 0x0203,
            S = 0x0405,
            i = 0x06070809,
            I = 0x0a0b0c0d,
            l = 0xe0f101112131415,
            L = 0x161718191a1b1c,
            f = 1.234f,
            d = 4.56789,
            MyString = @"123456789", // null terminated => only 9 characters!
            foo2 = sample2,
        };

        var bytes_LE = Dummy.StructToBytes(sample, Endianness.LittleEndian);
        var restoredLEAsLE = Dummy.BytesToStruct<Foo>(bytes_LE, Endianness.LittleEndian);
        var restoredLEAsBE = Dummy.BytesToStruct<Foo>(bytes_LE, Endianness.BigEndian);

        var bytes_BE = Dummy.StructToBytes(sample, Endianness.BigEndian);
        var restoredBEAsLE = Dummy.BytesToStruct<Foo>(bytes_BE, Endianness.LittleEndian);
        var restoredBEAsBE = Dummy.BytesToStruct<Foo>(bytes_BE, Endianness.BigEndian);

        Debug.Assert(sample.Equals(restoredLEAsLE));
        Debug.Assert(sample.Equals(restoredBEAsBE));
        Debug.Assert(restoredBEAsLE.Equals(restoredLEAsBE));
    }

    public enum Endianness
    {
        BigEndian,
        LittleEndian
    }

    private static void MaybeAdjustEndianness(Type type, byte[] data, Endianness endianness, int startOffset = 0)
    {
        if ((BitConverter.IsLittleEndian) == (endianness == Endianness.LittleEndian))
        {
            // nothing to change => return
            return;
        }

        foreach (var field in type.GetFields())
        {
            var fieldType = field.FieldType;
            if (field.IsStatic)
                // don't process static fields
                continue;

            if (fieldType == typeof(string)) 
                // don't swap bytes for strings
                continue;

            var offset = Marshal.OffsetOf(type, field.Name).ToInt32();

            // handle enums
            if (fieldType.IsEnum)
                fieldType = Enum.GetUnderlyingType(fieldType);

            // check for sub-fields to recurse if necessary
            var subFields = fieldType.GetFields().Where(subField => subField.IsStatic == false).ToArray();

            var effectiveOffset = startOffset + offset;

            if (subFields.Length == 0)
            {
                Array.Reverse(data, effectiveOffset, Marshal.SizeOf(fieldType));
            }
            else
            {
                // recurse
                MaybeAdjustEndianness(fieldType, data, endianness, effectiveOffset);
            }
        }
    }

    internal static T BytesToStruct<T>(byte[] rawData, Endianness endianness) where T : struct
    {
        T result = default(T);

        MaybeAdjustEndianness(typeof(T), rawData, endianness);

        GCHandle handle = GCHandle.Alloc(rawData, GCHandleType.Pinned);

        try
        {
            IntPtr rawDataPtr = handle.AddrOfPinnedObject();
            result = (T)Marshal.PtrToStructure(rawDataPtr, typeof(T));
        }
        finally
        {
            handle.Free();
        }

        return result;
    }

    internal static byte[] StructToBytes<T>(T data, Endianness endianness) where T : struct
    {
        byte[] rawData = new byte[Marshal.SizeOf(data)];
        GCHandle handle = GCHandle.Alloc(rawData, GCHandleType.Pinned);
        try
        {
            IntPtr rawDataPtr = handle.AddrOfPinnedObject();
            Marshal.StructureToPtr(data, rawDataPtr, false);
        }
        finally
        {
            handle.Free();
        }

        MaybeAdjustEndianness(typeof(T), rawData, endianness);

        return rawData;
    }

}

As alluded to in my comment on @weismat's answer, there is an easy way to achieve big-endian structuring. It involves a double-reversal: the original bytes are reversed entirely, then the struct itself is the reversal of the original (big-endian) data format.

The fooLe and fooBe in Main will have the same values for all fields. (Normally, the little-endian struct and bytes wouldn't be present, of course, but this clearly shows the relationship between the byte orders.)

NOTE: See updated code including how to get bytes back out of the struct.

public void Main()
{
    var beBytes = new byte[] {
        0x80, 
        0x80,0, 
        0x80,0, 
        0x80,0,0,0, 
        0x80,0,0,0,
        0x80,0,0,0,0,0,0,0, 
        0x80,0,0,0,0,0,0,0, 
        0x3F,0X80,0,0, // float of 1 (see http://en.wikipedia.org/wiki/Endianness#Floating-point_and_endianness)
        0x3F,0xF0,0,0,0,0,0,0, // double of 1
        0,0,0,0x67,0x6E,0x69,0x74,0x73,0x65,0x54 // Testing\0\0\0
    };
    var leBytes = new byte[] {
        0x80, 
        0,0x80,
        0,0x80, 
        0,0,0,0x80,
        0,0,0,0x80, 
        0,0,0,0,0,0,0,0x80, 
        0,0,0,0,0,0,0,0x80, 
        0,0,0x80,0x3F, // float of 1
        0,0,0,0,0,0,0xF0,0x3F, // double of 1
        0x54,0x65,0x73,0x74,0x69,0x6E,0x67,0,0,0 // Testing\0\0\0
    };
    Foo fooLe = ByteArrayToStructure<Foo>(leBytes).Dump("LE");
    FooReversed fooBe = ByteArrayToStructure<FooReversed>(beBytes.Reverse().ToArray()).Dump("BE");  
}

[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct Foo  {
    public byte b1;
    public short s;
    public ushort S;
    public int i;
    public uint I;
    public long l;
    public ulong L;
    public float f;
    public double d;
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 10)]
    public string MyString;
}

[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct FooReversed  {
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 10)]
    public string MyString;
    public double d;
    public float f;
    public ulong L;
    public long l;
    public uint I;
    public int i;
    public ushort S;
    public short s;
    public byte b1;
}

T ByteArrayToStructure<T>(byte[] bytes) where T: struct 
{
    GCHandle handle = GCHandle.Alloc(bytes, GCHandleType.Pinned);
    T stuff = (T)Marshal.PtrToStructure(handle.AddrOfPinnedObject(),typeof(T));
    handle.Free();
    return stuff;
}

It seems there must be a more elegant solution, but this should at least get you going:

    static T ByteArrayToStructureBigEndian<T>(byte[] bytes) where T : struct
    {
        GCHandle handle = GCHandle.Alloc(bytes, GCHandleType.Pinned);
        T stuff = (T)Marshal.PtrToStructure(handle.AddrOfPinnedObject(), typeof(T));
        handle.Free();
        System.Type t = stuff.GetType();
        FieldInfo[] fieldInfo = t.GetFields();
        foreach (FieldInfo fi in fieldInfo)
        {                 
            if (fi.FieldType == typeof(System.Int16))
            {
                // TODO
            }
            else if (fi.FieldType == typeof(System.Int32))
            {
                // TODO
            }
            else if (fi.FieldType == typeof(System.Int64))
            {
                // TODO
            }
            else if (fi.FieldType == typeof(System.UInt16))
            {
                UInt16 i16 = (UInt16)fi.GetValue(stuff);
                byte[] b16 = BitConverter.GetBytes(i16);
                byte[] b16r = b16.Reverse().ToArray();
                fi.SetValueDirect(__makeref(stuff), BitConverter.ToUInt16(b16r, 0);
            }
            else if (fi.FieldType == typeof(System.UInt32))
            {
                // TODO
            }
            else if (fi.FieldType == typeof(System.UInt64))
            {
                // TODO
            }
        }
        return stuff;
    }

I finally figured out a way that didn't involve reflection and is mostly user-friendly. It uses Mono's DataConverter class (source) which, unfortunately, is fairly buggy at this point. (For example, floats and doubles don't seem to work correctly, string parsing is broken, etc.)

The trick is to Unpack and re-pack the bytes as big-endian, which requires a string describing what types are in the byte array (see the last method).Also, byte alignment is tricky: there are four bytes in the struct instead of one because marshaling seems to rely on having 4-byte alignment (I still don't quite understand that part). (EDIT: I have found that adding Pack=1 to the StructLayout attribute usually takes care of byte-alignment issues.)

Note, this sample code was used in LINQPad - the Dump extension method just prints info about the object and returns the object (it's fluent).

public void Main()
{
    var beBytes = new byte[] {
        0x80, 
        0x80, 
        0x80, 
        0x80, 
        0x80,0, 
        0x80,0, 
        0x80,0,0,0, 
        0x80,0,0,0,
        0x80,0,0,0,0,0,0,0, 
        0x80,0,0,0,0,0,0,0, 
//      0,0,0x80,0x3F, // float of 1
//      0,0,0,0,0,0,0xF0,0x3F, // double of 1
        0x54,0x65,0x73,0x74,0x69,0x6E,0x67,0,0,0 // Testing\0\0\0
    };
    var leBytes = new byte[] {
        0x80, 
        0x80, 
        0x80, 
        0x80, 
        0,0x80,
        0,0x80, 
        0,0,0,0x80,
        0,0,0,0x80, 
        0,0,0,0,0,0,0,0x80, 
        0,0,0,0,0,0,0,0x80, 
//      0,0,0x80,0x3F, // float of 1
//      0,0,0,0,0,0,0xF0,0x3F, // double of 1
        0x54,0x65,0x73,0x74,0x69,0x6E,0x67,0,0,0 // Testing\0\0\0
    };
    Foo fooLe = ByteArrayToStructure<Foo>(leBytes).Dump("LE");
    Foo fooBe = ByteArrayToStructureBigEndian<Foo>(beBytes, 
        "bbbbsSiIlL"
//      + "fd" // float, then double
        +"9bb").Dump("BE");
    Assert.AreEqual(fooLe, fooBe);
}

[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct Foo  {
    public byte b1;
    public byte b2;
    public byte b3;
    public byte b4;
    public short s;
    public ushort S;
    public int i;
    public uint I;
    public long l;
    public ulong L;
//  public float f;
//  public double d;
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 10)]
    public string MyString;
}

T ByteArrayToStructure<T>(byte[] bytes) where T: struct 
{
    GCHandle handle = GCHandle.Alloc(bytes, GCHandleType.Pinned);
    T stuff = (T)Marshal.PtrToStructure(handle.AddrOfPinnedObject(),typeof(T));
    handle.Free();
    return stuff;
}

T ByteArrayToStructureBigEndian<T>(byte[] bytes, string description) where T: struct 
{
    byte[] buffer = bytes;
    IList unpacked = DataConverter.Unpack("^"+description, buffer, 0).Dump("unpacked");
    buffer = DataConverter.PackEnumerable("!"+description, unpacked).Dump("packed");
    return ByteArrayToStructure<T>(buffer);
}
kriss

I agree with @weismat and believe there is no solution.

What you show in your example is that you can access a raw byte buffer as if it were any OTHER structure without changing anything to it, not copying or moving data around, nothing. Just pinning it to avoid it to move around because of GC.

This is basically what you usually achieve in C by using a union type containing both your target structure and a byte array of the same size.

The good side is that it is really efficient.

That has several drawbacks, the main one being that you can only get access this way to data that are in the native machine order (be it LE or BE). Hence your ByteArrayToStructure is not really LE, it is only so because the processor underneath is LE. If you compile the same program on another target that happen to be BE, it works the other way and believe your byte array is BE.

Other drawbacks are that you must be very cautious with data alignment, be aware of possible padding, etc. and of course that there is no way to change byte order from LE to BE without moving data in bytes array (if you have a 16 bits integers only array as in your example this is merely swapping every two bytes).

I happened to have a similar problem and poundered not to use this solution because of the previous drawbacks and opted to hide my input structures behind accessors to hide access to the bytes array underneath. It may not be as elegant, but it is simple and also avoid to copy the buffer or move data in any way.

Have you tried MiscUtil? It has a utility class named EndianBitConverter to convert between big and little endian byte arrays.

From my point of view you just need to add an Array.Reverse() before the conversion of the byte array.

The traditional solution is to use ntohl() and ntohs().

typedef struct {
  long foo;
  short bar, baz;
  char xyzzy;
} Data;

Data d;
memcpy(&d, buffer, sizeof(Data));

d.foo = ntohl(d.foo);
d.bar = ntohs(d.bar);
d.baz = ntohs(d.baz);
// don't need to change d.xyxxy

The above works on any platform that has BSD Sockets, no matter whether it's big-endian, little-endian, or something utterly weird like a VAX. The reverse operation is done using hton*().

On big-endian platforms the functions are usually no-ops and should therefore be zero cost.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!