How to loop over a binary file in Python in chunks

青春惊慌失措 2021-01-22 00:13

I'm trying to use Python to loop over a long binary file filled with 8-byte records.

Each record has the format [ uint16 | uint16 | uint32 ]
(which

3 answers
  • 2021-01-22 00:45

    The iter builtin, if passed a callable and a sentinel value, will call the callable repeatedly until it returns the sentinel value.

    So you can create a partial function with functools.partial (or use a lambda) and pass it to iter, like this:

    import functools

    with open('foo.bin', 'rb') as f:
        chunker = functools.partial(f.read, 8)
        for chunk in iter(chunker, b''):      # Read 8-byte chunks until read() returns b'' at EOF
            ...  # Do stuff with chunk
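
As a rough sketch of how this pattern might look end to end, the snippet below writes three sample records to an in-memory buffer and reads them back in 8-byte chunks. The `io.BytesIO` buffer, the sample values, and the explicit little-endian format `<HHI` are assumptions made for illustration; a real file opened in `'rb'` mode should behave the same way.

```python
import functools
import io
import struct

# Assumed layout: uint16, uint16, uint32, little-endian, no padding (8 bytes)
RECORD = struct.Struct('<HHI')

# Hypothetical sample data standing in for the contents of foo.bin
buf = io.BytesIO(b''.join(RECORD.pack(i, i + 1, i + 2) for i in range(3)))

chunker = functools.partial(buf.read, RECORD.size)
records = []
for chunk in iter(chunker, b''):      # stops when read() returns b'' at EOF
    if len(chunk) < RECORD.size:      # trailing partial record: stop here
        break
    records.append(RECORD.unpack(chunk))

print(records)
```

Note that the sentinel only catches an empty read, so a short final chunk still has to be checked by hand before unpacking.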
    
  • 2021-01-22 00:48

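    f.read(len) returns a single bytes object, and iterating over a bytes object yields one int per byte. So in the for loop, raw is a single byte value (an int), not an 8-byte record.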
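
A quick way to see this behavior, using a hypothetical hard-coded 8-byte value in place of the result of f.read(8):

```python
# Hypothetical 8-byte record, standing in for what f.read(8) would return
data = b'\x01\x00\x02\x00\x03\x00\x00\x00'

# Iterating over a bytes object yields one int per byte, not bytes objects
items = [item for item in data]

print(items)
print(type(items[0]))
```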

    The correct way of looping is:

    import struct

    with open(fname, 'rb') as f:
        while True:
            raw = f.read(8)
            if len(raw) != 8:
                break  # end of file; ignore the incomplete "record" if any
            record = struct.unpack("HHI", raw)
            print(record)
    
  • 2021-01-22 00:56

    I've never used this before, but it looks like an initialization issue:

    with open(fname, "rb") as f:
        fmt = 'HHI'
        raw = struct.pack(fmt, 1, 2, 3)
        len = struct.calcsize(fmt)
        print(len)                # This shows 8, as expected
        for raw in f.read(len):   # Expect this should read 8 bytes into raw
            print(type(raw))      # This says raw is an 'int', not a byte-array
            record = struct.unpack(fmt, raw)  # "TypeError: a bytes-like object is required, not 'int'"
            print(record)
    

    You may want to look at struct.iter_unpack() as an optimization if you have enough RAM to hold the whole file in memory.
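
A minimal sketch of the iter_unpack() approach, assuming the whole file fits in memory. The in-memory `data` value stands in for the result of `f.read()`, and the little-endian format `<HHI` and sample values are assumptions for illustration:

```python
import struct

FMT = '<HHI'                    # assumed layout: uint16, uint16, uint32
size = struct.calcsize(FMT)     # 8 bytes per record

# Hypothetical file contents: four records plus three stray trailing bytes
data = b''.join(struct.pack(FMT, i, 2 * i, 3 * i) for i in range(4)) + b'\xff\xff\xff'

# iter_unpack() raises struct.error unless the buffer length is a
# multiple of the record size, so trim any incomplete trailing record
usable = len(data) - len(data) % size
records = list(struct.iter_unpack(FMT, data[:usable]))

print(records)
```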

    Note that in Python 3.7, the type of the Struct.format attribute changed from bytes to str; see near the end of the page https://docs.python.org/3/library/struct.html#struct.pack
