Parsing binary data into ctypes Structure object via readinto()

南旧 2020-12-30 10:47

I'm trying to handle a binary format, following the example here:

http://dabeaz.blogspot.jp/2009/08/python-binary-io-handling.html

>>> from         


        
1 answer
  •  半阙折子戏
    2020-12-30 11:22

    This field definition actually declares a bitfield:

    ...
    ("more_funky_numbers_7bytes", c_uint, 56),
    ...
    

    which is wrong here. The width of a bitfield must be less than or equal to the size of its type, so a c_uint bitfield can be at most 32 bits wide. One extra bit raises the exception:

    ValueError: number of bits invalid for bit field
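
The width limit can be checked directly. The following is a minimal sketch, assuming a platform where c_uint is 4 bytes; the helper name `try_bitfield` is made up for illustration and is not part of ctypes.

```python
from ctypes import Structure, c_uint, sizeof

def try_bitfield(width):
    """Return None if ctypes accepts the width, else the ValueError message."""
    try:
        # Defining the class is what triggers the width check.
        class Probe(Structure):
            _fields_ = [("field", c_uint, width)]
    except ValueError as exc:
        return str(exc)
    return None

max_bits = sizeof(c_uint) * 8   # 32 where c_uint is 4 bytes (assumption)
ok = try_bitfield(max_bits)     # a full-width bitfield is accepted
too_wide = try_bitfield(56)     # the width used in the question

print(ok)        # None
print(too_wide)  # the ValueError message
```

The exact wording of the message may differ between Python versions, so the sketch only relies on the exception type.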
    

    Example of using the bitfield:

    from ctypes import *
    
    class MyStructure(Structure):
        _fields_ = [
            # c_uint8 is 8 bits wide
            ('a', c_uint8, 4), # first 4 bits of the first byte
            ('b', c_uint8, 2), # next 2 bits of the first byte
            ('c', c_uint8, 2), # last 2 bits of the first byte
            ('d', c_uint8, 2), # the first byte is now full, so a new
                               # byte is created and `d` takes its
                               # first two bits
        ]
    
    mystruct = MyStructure()
    
    mystruct.a = 0b0000
    mystruct.b = 0b11
    mystruct.c = 0b00
    mystruct.d = 0b11
    
    v = c_uint16()
    
    # copy `mystruct` into `v`; ctypes.memmove is portable
    # (the original called cdll.msvcrt.memcpy, which is Windows-only)
    memmove(byref(v), byref(mystruct), sizeof(v))
    
    print(sizeof(mystruct)) # 2 bytes, so 6 bits of the second byte are unused
    print(bin(v.value))     # 0b1100110000 on a little-endian machine
    

    What you need is 7 bytes, so what you ended up doing is correct:

    ...
    ("more_funky_numbers_7bytes", c_byte * 7),
    ...
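
If those 7 bytes are meant to be read as one number, the array can be converted afterwards. This is a sketch under the assumption that the field holds a single unsigned big-endian integer, which the question does not confirm; the sample bytes are made up.

```python
from ctypes import c_byte

# Hypothetical contents of the 7-byte field.
raw = (c_byte * 7)(0x00, 0x00, 0x00, 0x00, 0x00, 0x01, 0x02)

# bytes() copies the array's memory; int.from_bytes (Python 3)
# then interprets those 7 bytes as one big-endian unsigned integer.
value = int.from_bytes(bytes(raw), byteorder="big")

print(value)  # 258, i.e. 0x0102
```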
    

    As for the size of the structure, it's going to be 52 rather than 51: one extra padding byte is inserted so that the 2-byte payload_length_2bytes field starts on an even offset, and the total stays a multiple of 4, the alignment of c_uint. Here:

    from ctypes import *
    
    class BinaryHeader(BigEndianStructure):
        _fields_ = [
            ("sequence_number_4bytes", c_uint),
            ("ascii_text_32bytes", c_char * 32),
            ("timestamp_4bytes", c_uint),
            ("more_funky_numbers_7bytes", c_byte * 7),
            ("some_flags_1byte", c_byte),
            ("other_flags_1byte", c_byte),
            ("payload_length_2bytes", c_ushort),
        ]
    
    mystruct = BinaryHeader(
        0x11111111,
        b'\x22' * 32,
        0x33333333,
        (c_byte * 7)(*([0x44] * 7)),
        0x55,
        0x66,
        0x7777
    )
    
    print(sizeof(mystruct))  # 52
    
    with open('data.txt', 'wb') as f:
        f.write(mystruct)
    

    The extra byte is inserted between other_flags_1byte and payload_length_2bytes in the file:

    00000000 11 11 11 11 ....
    00000004 22 22 22 22 """"
    00000008 22 22 22 22 """"
    0000000C 22 22 22 22 """"
    00000010 22 22 22 22 """"
    00000014 22 22 22 22 """"
    00000018 22 22 22 22 """"
    0000001C 22 22 22 22 """"
    00000020 22 22 22 22 """"
    00000024 33 33 33 33 3333
    00000028 44 44 44 44 DDDD
    0000002C 44 44 44 55 DDDU
    00000030 66 00 77 77 f.ww
                ^
             extra byte
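
The padding can also be located without a hex dump, since every field descriptor exposes `offset` and `size`. The sketch below assumes the same field list as above; the class name `PaddedHeader` and the helper `find_gaps` are illustrative names, not part of ctypes.

```python
from ctypes import BigEndianStructure, c_byte, c_char, c_uint, c_ushort, sizeof

class PaddedHeader(BigEndianStructure):
    # Same fields as BinaryHeader, default (native) alignment.
    _fields_ = [
        ("sequence_number_4bytes", c_uint),
        ("ascii_text_32bytes", c_char * 32),
        ("timestamp_4bytes", c_uint),
        ("more_funky_numbers_7bytes", c_byte * 7),
        ("some_flags_1byte", c_byte),
        ("other_flags_1byte", c_byte),
        ("payload_length_2bytes", c_ushort),
    ]

def find_gaps(struct_type):
    """Return (field_name, gap_in_bytes) for each field preceded by padding."""
    gaps = []
    end = 0  # offset just past the previous field
    for field in struct_type._fields_:
        name = field[0]
        desc = getattr(struct_type, name)
        if desc.offset > end:
            gaps.append((name, desc.offset - end))
        end = desc.offset + desc.size
    return gaps

gaps = find_gaps(PaddedHeader)
print(sizeof(PaddedHeader))  # 52
print(gaps)                  # [('payload_length_2bytes', 1)]
```

The helper only handles plain fields; for bitfields `size` encodes bit information, so it would need adjusting.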
    

    This is a problem for file formats and network protocols, where the layout must match byte for byte. To remove the padding, set the packing to 1:

     ...
    class BinaryHeader(BigEndianStructure):
        _pack_ = 1
        _fields_ = [
            ("sequence_number_4bytes", c_uint),
    ...
    

    The file then becomes:

    00000000 11 11 11 11 ....
    00000004 22 22 22 22 """"
    00000008 22 22 22 22 """"
    0000000C 22 22 22 22 """"
    00000010 22 22 22 22 """"
    00000014 22 22 22 22 """"
    00000018 22 22 22 22 """"
    0000001C 22 22 22 22 """"
    00000020 22 22 22 22 """"
    00000024 33 33 33 33 3333
    00000028 44 44 44 44 DDDD
    0000002C 44 44 44 55 DDDU
    00000030 66 77 77    fww 
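
With the packed layout, the 51 bytes can be read straight back into a structure with readinto(), as the question's title asks. This is a minimal sketch: `PackedHeader` is an illustrative name, and an in-memory `io.BytesIO` stands in for a file opened with `open(path, 'rb')`.

```python
import io
from ctypes import BigEndianStructure, c_byte, c_char, c_uint, c_ushort, sizeof

class PackedHeader(BigEndianStructure):
    _pack_ = 1  # no padding between fields
    _fields_ = [
        ("sequence_number_4bytes", c_uint),
        ("ascii_text_32bytes", c_char * 32),
        ("timestamp_4bytes", c_uint),
        ("more_funky_numbers_7bytes", c_byte * 7),
        ("some_flags_1byte", c_byte),
        ("other_flags_1byte", c_byte),
        ("payload_length_2bytes", c_ushort),
    ]

# The same 51 bytes as in the packed hex dump above.
raw = (b"\x11" * 4 + b"\x22" * 32 + b"\x33" * 4 +
       b"\x44" * 7 + b"\x55\x66\x77\x77")
stream = io.BytesIO(raw)  # stand-in for a binary file object

header = PackedHeader()
# readinto() fills the structure's own memory; no intermediate copy.
bytes_read = stream.readinto(header)
if bytes_read != sizeof(header):
    raise EOFError("truncated header")

print(hex(header.sequence_number_4bytes))  # 0x11111111
print(header.payload_length_2bytes)        # 30583
```

Checking the return value of readinto() matters, because a short read leaves the tail of the structure at its previous contents.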
    

    As for the struct module, it won't make things easier in your case. Sadly, it doesn't support nested tuples in the format string. For example:

    >>> from struct import *
    >>>
    >>> data = '\x11\x11\x11\x11\x22\x22\x22\x22\x22\x22\x22\x22\x22\x22\x22\x22\x22
    \x22\x22\x22\x22\x22\x22\x22\x22\x22\x22\x22\x22\x22\x22\x22\x22\x22\x22\x22\x33
    \x33\x33\x33\x44\x44\x44\x44\x44\x44\x44\x55\x66\x77\x77'
    >>>
    >>> BinaryHeader = Struct('>I32cI7BBBH')
    >>>
    >>> BinaryHeader.unpack(data)
    (286331153, '"', '"', '"', '"', '"', '"', '"', '"', '"', '"', '"', '"', '"', '"'
    , '"', '"', '"', '"', '"', '"', '"', '"', '"', '"', '"', '"', '"', '"', '"', '"'
    , '"', '"', 858993459, 68, 68, 68, 68, 68, 68, 68, 85, 102, 30583)
    >>>
    

    This result cannot be fed to a namedtuple directly; you still have to parse it by index. It would work if you could write something like '>I(32c)(I)(7B)(B)(B)H'. This feature was requested here (Extend struct.unpack to produce nested tuples) back in 2003, but nothing has been done since.
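
One possible workaround, if it is acceptable to receive the 32-byte text and the 7-byte field as opaque byte strings, is the `s` format code, which returns a whole run of bytes as one item. The sketch below is an alternative to the format shown above, not the nested-tuple feature itself; the namedtuple field names are made up for illustration.

```python
import struct
from collections import namedtuple

# 32s and 7s each yield a single bytes object instead of 32 or 7 items.
HEADER = struct.Struct(">I32sI7sBBH")

Header = namedtuple(
    "Header",
    "sequence_number ascii_text timestamp funky_numbers "
    "some_flags other_flags payload_length",
)

data = (b"\x11" * 4 + b"\x22" * 32 + b"\x33" * 4 +
        b"\x44" * 7 + b"\x55\x66\x77\x77")

header = Header._make(HEADER.unpack(data))

print(HEADER.size)             # 51
print(header.sequence_number)  # 286331153
print(header.payload_length)   # 30583
```

The 7-byte field still has to be decoded separately, since struct has no 56-bit integer code.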
