I was reading a binary file in Python like this:

```python
from struct import unpack

ns = 1000
f = open("binary_file", 'rb')
while True:
    data = f.read(ns * 4)
    if not data:  # an empty read means end of file
        break
    unpacked = unpack(">%sf" % ns, data)
    print(str(unpacked))
```
when I realized that unpack(">f", str) is for unpacking IEEE floating point, whereas my data is IBM 32-bit floating-point numbers.
My question is: how can I implement my own unpack to handle IBM 32-bit floating-point numbers?
I don't mind using something like ctypes to extend Python for better performance.
EDIT: I did some searching: http://mail.scipy.org/pipermail/scipy-user/2009-January/019392.html
This looks very promising, but I want something more efficient: there are potentially tens of thousands of loop iterations.
EDIT: posted answer below. Thanks for the tip.
I think I understand it: first unpack the string to an unsigned 4-byte integer, and then use this function:
```python
def ibm2ieee(ibm):
    """
    Converts an IBM floating point number into IEEE format.
    :param: ibm - 32 bit unsigned integer: unpack('>L', f.read(4))
    """
    if ibm == 0:
        return 0.0
    sign = ibm >> 31 & 0x01
    exponent = ibm >> 24 & 0x7f
    mantissa = (ibm & 0x00ffffff) / float(pow(2, 24))
    return (1 - 2 * sign) * mantissa * pow(16, exponent - 64)
```
Thanks to all who helped!
IBM Floating Point Architecture, how to encode and decode: http://en.wikipedia.org/wiki/IBM_Floating_Point_Architecture
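To make the decoding concrete, here is a minimal worked sketch that splits one 32-bit word into its three fields by hand. It uses the example word from the Wikipedia article (0xC276A000); the variable names are only illustrative and are not part of any library.

```python
import struct

# Example word from the Wikipedia article; it should decode to -118.625.
raw = struct.pack(">L", 0xC276A000)   # four big-endian bytes, as read from a file
(word,) = struct.unpack(">L", raw)    # back to an unsigned 32-bit integer

sign = word >> 31                     # 1 bit: 1 means negative
exponent = (word >> 24) & 0x7F        # 7 bits: base-16 exponent, biased by 64
fraction = (word & 0x00FFFFFF) / float(2 ** 24)  # 24-bit fraction in [0, 1)

value = (-1) ** sign * fraction * 16.0 ** (exponent - 64)
print(sign, exponent, fraction, value)
```

For this word the sign bit is 1 and the exponent field is 66, so the fraction is scaled by 16 ** 2 and the result is -118.625.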
My solution: I wrote a class. I think it can be a bit faster this way because it uses a Struct object, so the unpack format is compiled only once. EDIT: it should also help that it unpacks all size values (size * 4 bytes) in a single call, since unpacking can be an expensive operation.
```python
from struct import Struct


class StructIBM32(object):
    """
    see example in:
    http://en.wikipedia.org/wiki/IBM_Floating_Point_Architecture#An_Example

    >>> import struct
    >>> c = StructIBM32(1)
    >>> bit = '11000010011101101010000000000000'
    >>> c.unpack(struct.pack('>L', int(bit, 2)))
    [-118.625]
    """

    def __init__(self, size):
        self.p24 = float(pow(2, 24))
        self.unpack32int = Struct(">%sL" % size).unpack

    def unpack(self, data):
        int32 = self.unpack32int(data)
        return [self.ibm2ieee(i) for i in int32]

    def ibm2ieee(self, int32):
        if int32 == 0:
            return 0.0
        sign = int32 >> 31 & 0x01
        exponent = int32 >> 24 & 0x7f
        mantissa = (int32 & 0x00ffffff) / self.p24
        return (1 - 2 * sign) * mantissa * pow(16, exponent - 64)


if __name__ == "__main__":
    import doctest
    doctest.testmod()
```
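One possible further tweak, which I have not benchmarked: since the sign bit and the 7-bit exponent together occupy the top byte, the scale factor for each of the 256 possible top bytes can be precomputed once. The names `_SCALE` and `unpack_ibm32` below are hypothetical and not part of the class above; treat this as a sketch that assumes big-endian input, and measure before relying on it.

```python
import struct

# Hypothetical lookup table (an assumption, not from the original post):
# one signed scale factor per possible top byte (sign bit + 7-bit exponent).
# The division by 2**24 folds in the conversion of the 24-bit fraction.
_SCALE = [
    (-1.0 if top & 0x80 else 1.0) * 16.0 ** ((top & 0x7F) - 64) / 2.0 ** 24
    for top in range(256)
]


def unpack_ibm32(data):
    """Decode big-endian IBM 32-bit floats in `data` into a list of floats.

    Any trailing bytes that do not form a full 4-byte word are ignored.
    """
    count = len(data) // 4
    words = struct.unpack(">%dL" % count, data[:count * 4])
    # Multiply the 24-bit fraction by the precomputed scale for its top byte.
    return [(w & 0x00FFFFFF) * _SCALE[w >> 24] for w in words]


sample = struct.pack(">3L", 0xC276A000, 0x41100000, 0)
print(unpack_ibm32(sample))
```

The sample packs three words: the Wikipedia example, the IBM encoding of 1.0, and zero. A zero word needs no special case here because its fraction is 0.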