How can I serialize doubles and floats in C?
I have the following code for serializing shorts, ints, and chars.
unsigned char * serialize_char(unsign
The portable way: use frexp to serialize (convert to integer mantissa and exponent) and ldexp to deserialize.
The simple way: assume in 2010 any machine you care about uses IEEE float, declare a union with a float element and a uint32_t element, and use your integer serialization code to serialize the float.
The binary-file-haters way: serialize everything as text, floats included. Use the "%a" printf format specifier to get a hex float, which is always expressed exactly (provided you don't limit the precision with something like "%.4a") and not subject to rounding errors. You can read these back with strtod or any of the scanf family of functions.
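As a rough illustration of the text route, here is a minimal sketch of a "%a" round trip. The helper names double_to_text and text_to_double are made up for this example, and it assumes a C99 library where printf supports "%a" and strtod accepts hex floats.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: write x into buf as a hex-float string.
 * Returns what snprintf returns: the number of characters needed
 * (excluding the terminating NUL), or a negative value on error. */
int double_to_text(double x, char *buf, size_t size)
{
    return snprintf(buf, size, "%a", x);
}

/* Hypothetical helper: parse a string produced by double_to_text. */
double text_to_double(const char *text)
{
    return strtod(text, NULL);
}
```

With no precision given, the string carries every bit of the double, so parsing it back should yield the same value. A 64-byte buffer is more than enough for any double.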
Following your update, you mention the data is to be transmitted using UDP and ask for best practices. I would highly recommend sending the data as text, perhaps even with some markup added (XML). Debugging endian-related errors across a transmission line is a waste of everybody's time.
Just my 2 cents on the "best practices" part of your question.
You can portably serialize in IEEE-754 regardless of the native representation:
    #include <float.h>
    #include <stdio.h>

    int fwriteieee754(double x, FILE * fp, int bigendian)
    {
        int shift;
        unsigned long sign, exp, hibits, hilong, lowlong;
        double fnorm, significand;
        int expbits = 11;
        int significandbits = 52;
        /* zero (can't handle signed zero) */
        if(x == 0) {
            hilong = 0;
            lowlong = 0;
            goto writedata;
        }
        /* infinity */
        if(x > DBL_MAX) {
            hilong = 1024 + ((1 << (expbits - 1)) - 1);
            hilong <<= (31 - expbits);
            lowlong = 0;
            goto writedata;
        }
        /* -infinity */
        if(x < -DBL_MAX) {
            hilong = 1024 + ((1 << (expbits - 1)) - 1);
            hilong <<= (31 - expbits);
            hilong |= (1UL << 31); /* 1UL: shifting a signed 1 into bit 31 is undefined */
            lowlong = 0;
            goto writedata;
        }
        /* NaN - dodgy because many compilers optimise out this test
         * isnan() is C99, POSIX.1 only, use it if you will.
         */
        if(x != x) {
            hilong = 1024 + ((1 << (expbits - 1)) - 1);
            hilong <<= (31 - expbits);
            lowlong = 1234;
            goto writedata;
        }
        /* get the sign */
        if(x < 0) {
            sign = 1;
            fnorm = -x;
        } else {
            sign = 0;
            fnorm = x;
        }
        /* get the normalized form of f and track the exponent */
        shift = 0;
        while(fnorm >= 2.0) {
            fnorm /= 2.0;
            shift++;
        }
        while(fnorm < 1.0) {
            fnorm *= 2.0;
            shift--;
        }
        /* check for denormalized numbers */
        if(shift < -1022) {
            while(shift < -1022) {
                fnorm /= 2.0;
                shift++;
            }
            shift = -1023;
        } else {
            /* take the significant bit off mantissa */
            fnorm = fnorm - 1.0;
        }
        /* calculate the integer form of the significand */
        /* hold it in a double for now */
        significand = fnorm * ((1LL << significandbits) + 0.5f);
        /* get the biased exponent */
        exp = shift + ((1 << (expbits - 1)) - 1); /* shift + bias */
        /* put the data into two longs */
        hibits = (long)(significand / 4294967296); /* 0x100000000 */
        hilong = (sign << 31) | (exp << (31 - expbits)) | hibits;
        lowlong = (unsigned long)(significand - hibits * 4294967296);
    writedata:
        /* write the bytes out to the stream */
        if(bigendian) {
            fputc((hilong >> 24) & 0xFF, fp);
            fputc((hilong >> 16) & 0xFF, fp);
            fputc((hilong >> 8) & 0xFF, fp);
            fputc(hilong & 0xFF, fp);
            fputc((lowlong >> 24) & 0xFF, fp);
            fputc((lowlong >> 16) & 0xFF, fp);
            fputc((lowlong >> 8) & 0xFF, fp);
            fputc(lowlong & 0xFF, fp);
        } else {
            fputc(lowlong & 0xFF, fp);
            fputc((lowlong >> 8) & 0xFF, fp);
            fputc((lowlong >> 16) & 0xFF, fp);
            fputc((lowlong >> 24) & 0xFF, fp);
            fputc(hilong & 0xFF, fp);
            fputc((hilong >> 8) & 0xFF, fp);
            fputc((hilong >> 16) & 0xFF, fp);
            fputc((hilong >> 24) & 0xFF, fp);
        }
        return ferror(fp);
    }
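On machines using IEEE-754 (i.e. the common case), all you'll need to do to get the number back is an fread() into a double, provided the byte order you wrote matches the host's. Otherwise, decode the bytes yourself: for a normalized double the value is (-1)^sign * 2^(exponent-1023) * 1.mantissa (the bias is 1023 for doubles; 127 is the bias for 32-bit floats).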
Note: when serializing in systems where the native double is more precise than the IEEE double, you might encounter off-by-one errors in the low bit.
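For the "decode the bytes yourself" case, here is a hedged sketch of the reverse direction. It assumes the eight bytes arrive most-significant first, as the bigendian branch of fwriteieee754 writes them. The name decode_ieee754_double is hypothetical, and the code relies on the C99 INFINITY and NAN macros being available on the host.

```c
#include <math.h>

/* Hypothetical helper: rebuild a double from 8 bytes holding a
 * big-endian IEEE-754 binary64 value, without assuming anything
 * about the host's own floating-point format. */
double decode_ieee754_double(const unsigned char b[8])
{
    unsigned long hi = ((unsigned long)b[0] << 24) | ((unsigned long)b[1] << 16)
                     | ((unsigned long)b[2] << 8)  |  (unsigned long)b[3];
    unsigned long lo = ((unsigned long)b[4] << 24) | ((unsigned long)b[5] << 16)
                     | ((unsigned long)b[6] << 8)  |  (unsigned long)b[7];
    int sign = (int)((hi >> 31) & 1UL);
    int exponent = (int)((hi >> 20) & 0x7FFUL);   /* biased exponent */
    /* 52-bit fraction: top 20 bits come from hi, low 32 bits from lo */
    double fraction = (double)(hi & 0xFFFFFUL) * 4294967296.0 + (double)lo;
    double value;

    if (exponent == 0x7FF) {
        /* all-ones exponent: infinity if the fraction is zero, else NaN */
        value = (fraction == 0.0) ? INFINITY : NAN;
    } else if (exponent == 0) {
        /* zero or subnormal: no implicit leading 1, exponent fixed at -1022 */
        value = ldexp(fraction, -1022 - 52);
    } else {
        /* normalized: add the implicit leading 1 (2^52), then remove the bias */
        value = ldexp(fraction + 4503599627370496.0, exponent - 1023 - 52);
    }
    return sign ? -value : value;
}
```

Reading from a stream would then be a matter of fetching eight bytes with fgetc or fread into an unsigned char array and passing it in. For the little-endian layout, reverse the array first.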
Hope this helps.
I remember first seeing the cast used in my example below in the good old Quake source code of the "rsqrt" routine, containing the coolest comment I'd seen at the time (Google it, you'll like it).
    unsigned char * serialize_float(unsigned char *buffer, float value)
    {
        unsigned int ivalue = *((unsigned int*)&value); // warning assumes 32-bit "unsigned int"
        buffer[0] = ivalue >> 24;
        buffer[1] = ivalue >> 16;
        buffer[2] = ivalue >> 8;
        buffer[3] = ivalue;
        return buffer + 4;
    }
I hope I've understood your question (and example code) correctly. Let me know if this was useful.
This packs a floating point value into an int and long long pair, which you can then serialise with your other functions. The unpack() function is used to deserialise.
The pair of numbers represents the exponent and fractional part of the number respectively.
    #include <math.h>   /* frexp, ldexp, fabs */
    #include <stdlib.h> /* llabs */

    #define FRAC_MAX 9223372036854775807LL /* 2**63 - 1 */

    struct dbl_packed
    {
        int exp;
        long long frac;
    };

    /* Note: infinities and NaNs are not handled. */
    void pack(double x, struct dbl_packed *r)
    {
        /* frexp returns a magnitude in [0.5, 1), so xf is in [0, 0.5);
           for x == 0 it returns 0 and xf goes negative */
        double xf = fabs(frexp(x, &r->exp)) - 0.5;
        if (xf < 0.0)
        {
            r->frac = 0;
            return;
        }
        r->frac = 1 + (long long)(xf * 2.0 * (FRAC_MAX - 1));
        if (x < 0.0)
            r->frac = -r->frac;
    }

    double unpack(const struct dbl_packed *p)
    {
        double xf, x;
        if (p->frac == 0)
            return 0.0;
        xf = ((double)(llabs(p->frac) - 1) / (FRAC_MAX - 1)) / 2.0;
        x = ldexp(xf + 0.5, p->exp);
        if (p->frac < 0)
            x = -x;
        return x;
    }
For the narrow question about float, note that you will probably end up assuming that both ends of the wire use the same representation for floating point. This might be safe today given the pervasive use of IEEE-754, but note that some current DSPs (I believe Blackfins) use a different representation. In the olden days, there were at least as many representations for floating point as there were manufacturers of hardware and libraries, so this was a bigger issue.
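Even with the same representation, it might not be stored with the same byte order. That will necessitate deciding on a byte order on the wire, and tweaking the code at each end. Either the type-punned pointer cast or the union will work in practice. Strictly speaking, reading a different union member is allowed in C99 and later and the result depends on the implementation's representation, while the pointer cast breaks the strict aliasing rule and is formally undefined behavior, so check your compiler's documentation and test.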
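As an alternative to the cast and the union, here is a minimal sketch that copies the bits with memcpy and always puts the most significant byte first on the wire. The function names serialize_float_be and deserialize_float_be are invented for this example, and it assumes float is a 32-bit IEEE-754 value stored with the same byte order as uint32_t on each host.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: write value into buffer as 4 big-endian bytes.
 * Assumes sizeof(float) == 4 (IEEE-754 binary32). */
unsigned char *serialize_float_be(unsigned char *buffer, float value)
{
    uint32_t bits;
    memcpy(&bits, &value, sizeof bits);   /* copy the object representation */
    buffer[0] = (unsigned char)(bits >> 24);
    buffer[1] = (unsigned char)(bits >> 16);
    buffer[2] = (unsigned char)(bits >> 8);
    buffer[3] = (unsigned char)bits;
    return buffer + 4;
}

/* Hypothetical helper: read 4 big-endian bytes back into *value. */
const unsigned char *deserialize_float_be(const unsigned char *buffer, float *value)
{
    uint32_t bits = ((uint32_t)buffer[0] << 24) | ((uint32_t)buffer[1] << 16)
                  | ((uint32_t)buffer[2] << 8)  |  (uint32_t)buffer[3];
    memcpy(value, &bits, sizeof *value);
    return buffer + 4;
}
```

Because the shifts work on the integer value rather than on memory, the bytes on the wire come out the same on big-endian and little-endian hosts, and compilers typically turn the memcpy into a plain register move.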
That said, text is often your friend for transferring floating point between platforms. The trick is to not use too many more characters than are really needed to convert it back.
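To make "no more characters than needed" concrete: for IEEE-754 values, 9 significant decimal digits are enough to recover any float and 17 are enough for any double. A small sketch under that assumption follows. The helper names are hypothetical, and the round trip also relies on the C library converting with correct rounding in both directions.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: 9 significant digits round-trip an IEEE-754 float. */
int float_to_decimal(float x, char *buf, size_t size)
{
    return snprintf(buf, size, "%.9g", x);
}

/* Hypothetical helper: 17 significant digits round-trip an IEEE-754 double. */
int double_to_decimal(double x, char *buf, size_t size)
{
    return snprintf(buf, size, "%.17g", x);
}
```

The receiving side can parse the strings with strtof and strtod respectively. On C11 compilers, FLT_DECIMAL_DIG and DBL_DECIMAL_DIG from float.h give these digit counts without hard-coding them.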
All in all, I'd recommend giving some serious consideration to using a library such as XDR that is robust, has been around for a while, and has been rubbed up against all of the sharp corners and edge cases.
If you insist on rolling your own, take care about subtle issues like whether int is 16 bits, 32 bits, or even 64 bits, in addition to the representation of float and double.
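One way to guard against these assumptions is to check them once at startup instead of trusting them silently. The following is only a sketch of such a check for float. The function name float_looks_like_binary32 is made up, and passing it does not prove full IEEE-754 conformance, only that the size, radix, precision, and the bit pattern of 1.0f match what binary32 would give.

```c
#include <float.h>
#include <limits.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 if float appears to be a 32-bit
 * IEEE-754 binary32 value laid out like uint32_t, 0 otherwise. */
int float_looks_like_binary32(void)
{
    const float one = 1.0f;
    uint32_t bits;

    if (CHAR_BIT != 8 || sizeof(float) != sizeof(uint32_t))
        return 0;
    if (FLT_RADIX != 2 || FLT_MANT_DIG != 24 || FLT_MAX_EXP != 128)
        return 0;
    memcpy(&bits, &one, sizeof bits);
    /* 1.0f is sign 0, biased exponent 127, fraction 0 in binary32 */
    return bits == UINT32_C(0x3F800000);
}
```

Using uint32_t and friends from stdint.h for the integer side sidesteps the question of how wide int is on a given platform.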