Portable C binary serialization primitives

Posted by 本小妞迷上赌 on 2019-12-03 12:11:13

I've never used them, but I think Google's Protocol Buffers satisfy your requirements.

  • 64 bit types, signed/unsigned, and floating point types are all supported.
  • The API generated is typesafe
  • Serialisation can be done to/from streams

This tutorial seems like a pretty good introduction, and you can read about the actual binary storage format here.


From their web page:

What Are Protocol Buffers?

Protocol buffers are Google's language-neutral, platform-neutral, extensible mechanism for serializing structured data – think XML, but smaller, faster, and simpler. You define how you want your data to be structured once, then you can use special generated source code to easily write and read your structured data to and from a variety of data streams and using a variety of languages – Java, C++, or Python.

There's no official implementation in pure C (only C++), but there are two C ports that might fit your needs:

I don't know how they fare in the presence of non-8-bit bytes, but it should be relatively easy to find out.

In my opinion the main drawback of functions like htonl() is that they do only half the work of serialization. They only swap the bytes of a multi-byte integer if your machine is little endian. The other important thing that must be done when serializing is handling alignment, and these functions don't do that.

A lot of CPUs are not capable of (efficiently) accessing multi-byte integers unless they are stored at a memory location whose address is a multiple of the size of the integer in bytes. This is the reason to never ever use struct overlays to (de)serialize network packets. I'm not sure if this is what you mean by 'in-place conversion'.
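To illustrate the alignment point, here is a minimal sketch that reads and writes a 32-bit big-endian value at an arbitrary (possibly odd) offset in a packet buffer. The names `ReadBe32` and `WriteBe32` are hypothetical and not taken from any library; the point is only that the buffer pointer is never cast to `uint32_t *`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read a 32-bit big-endian value from any address.
 * Only single bytes are accessed, so the pointer need not be aligned,
 * and the result does not depend on the host byte order. */
uint32_t ReadBe32(const uint8_t *p)
{
    assert(p != NULL);
    return ((uint32_t)p[0] << 24) |
           ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |
            (uint32_t)p[3];
}

/* Hypothetical helper: write a 32-bit value, most significant byte first. */
void WriteBe32(uint32_t value, uint8_t *p)
{
    assert(p != NULL);
    p[0] = (uint8_t)(value >> 24);
    p[1] = (uint8_t)(value >> 16);
    p[2] = (uint8_t)(value >> 8);
    p[3] = (uint8_t)value;
}
```

With `uint8_t pkt[] = {0xAA, 0x12, 0x34, 0x56, 0x78}`, the call `ReadBe32(pkt + 1)` yields `0x12345678` on any host, whereas `*(uint32_t *)(pkt + 1)` would be a misaligned access whose result also depends on the host's endianness.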

I work a lot with embedded systems, and I have functions in my own library which I always use when generating or parsing network packets (or any other I/O: disk, RS232, etc.):

/* Serialize an integer into a little or big endian byte buffer, resp. */
void SerializeLeInt(uint64_t value, uint8_t *buffer, size_t nrBytes);
void SerializeBeInt(uint64_t value, uint8_t *buffer, size_t nrBytes);

/* Deserialize an integer from a little or big endian byte buffer, resp. */
uint64_t DeserializeLeInt(const uint8_t *buffer, size_t nrBytes);
uint64_t DeserializeBeInt(const uint8_t *buffer, size_t nrBytes);

Along with these functions there are a bunch of macros defined, such as:

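#define SerializeBeInt16(value, buffer)     SerializeBeInt(value, buffer, sizeof(int16_t))
#define SerializeBeUint16(value, buffer)    SerializeBeInt(value, buffer, sizeof(uint16_t))
/* Assumed definition of DeserializeBeType (not given here): cast the generic result to the requested type. */
#define DeserializeBeType(buffer, type)     ((type)DeserializeBeInt(buffer, sizeof(type)))
#define DeserializeBeInt16(buffer)          DeserializeBeType(buffer, int16_t)
#define DeserializeBeUint16(buffer)         DeserializeBeType(buffer, uint16_t)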

The (de)serialize functions read or write the values byte by byte, so alignment problems will not occur. You don't need to worry about signedness either. In the first place, practically all systems these days use two's complement (besides a few ADCs maybe, but then you wouldn't use these functions). It should even work on a system using ones' complement, because C defines the conversion of a signed integer to an unsigned type as reduction modulo 2^N, which produces the two's complement bit pattern whatever the native representation is (and the functions accept/return unsigned integers).
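The bodies of these functions are not shown in the answer. The following is one possible sketch that matches the prototypes above and works byte by byte with shifts and masks; the `assert` on `nrBytes` is my own addition, not something the answer specifies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write the low nrBytes bytes of value, least significant byte first. */
void SerializeLeInt(uint64_t value, uint8_t *buffer, size_t nrBytes)
{
    assert(nrBytes <= sizeof(uint64_t));
    for (size_t i = 0; i < nrBytes; i++) {
        buffer[i] = (uint8_t)(value & 0xFFu);
        value >>= 8;
    }
}

/* Write the low nrBytes bytes of value, most significant byte first. */
void SerializeBeInt(uint64_t value, uint8_t *buffer, size_t nrBytes)
{
    assert(nrBytes <= sizeof(uint64_t));
    for (size_t i = nrBytes; i > 0; i--) {
        buffer[i - 1] = (uint8_t)(value & 0xFFu);
        value >>= 8;
    }
}

/* Rebuild an integer from nrBytes bytes stored least significant first. */
uint64_t DeserializeLeInt(const uint8_t *buffer, size_t nrBytes)
{
    assert(nrBytes <= sizeof(uint64_t));
    uint64_t value = 0;
    for (size_t i = nrBytes; i > 0; i--) {
        value = (value << 8) | buffer[i - 1];
    }
    return value;
}

/* Rebuild an integer from nrBytes bytes stored most significant first. */
uint64_t DeserializeBeInt(const uint8_t *buffer, size_t nrBytes)
{
    assert(nrBytes <= sizeof(uint64_t));
    uint64_t value = 0;
    for (size_t i = 0; i < nrBytes; i++) {
        value = (value << 8) | buffer[i];
    }
    return value;
}
```

For example, `SerializeBeInt((uint64_t)(int16_t)-2, buf, 2)` stores the bytes `0xFF 0xFE`, and `DeserializeBeInt(buf, 2)` returns `0xFFFE`. Converting that back to `int16_t` gives -2 on the usual two's complement compilers, though strictly speaking that last conversion is implementation-defined in C.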

Another of your arguments is that they depend on 8-bit bytes and the presence of the exact-size uintN_t types. This also applies to my functions, but in my opinion it is not a problem (those types are always defined for the systems and compilers I work with). You could tweak the function prototypes to use unsigned char instead of uint8_t and something like long long or uint_least64_t instead of uint64_t if you like.
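As a rough sketch of that tweak, the big-endian pair could look like this. The names `SerializeBeIntPortable` and `DeserializeBeIntPortable` are hypothetical, and I am assuming that each buffer element carries exactly one octet (a value from 0 to 255), even on a machine where `CHAR_BIT` is larger than 8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical variant without exact-width types: store one octet per
 * unsigned char, most significant octet first. The 0xFF mask keeps each
 * element in the range 0..255 even if unsigned char is wider than 8 bits. */
void SerializeBeIntPortable(uint_least64_t value, unsigned char *buffer,
                            size_t nrOctets)
{
    assert(nrOctets <= 8);
    for (size_t i = nrOctets; i > 0; i--) {
        buffer[i - 1] = (unsigned char)(value & 0xFFu);
        value >>= 8;
    }
}

/* Hypothetical counterpart: rebuild the value from nrOctets octets,
 * ignoring any bits above the low 8 in each buffer element. */
uint_least64_t DeserializeBeIntPortable(const unsigned char *buffer,
                                        size_t nrOctets)
{
    assert(nrOctets <= 8);
    uint_least64_t value = 0;
    for (size_t i = 0; i < nrOctets; i++) {
        value = (value << 8) | (uint_least64_t)(buffer[i] & 0xFFu);
    }
    return value;
}
```

The logic is the same as with `uint8_t` and `uint64_t`; only the types and the explicit masking differ, so the octet layout of the output stays identical on platforms that do have 8-bit bytes.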

See the XDR library and the XDR standards, RFC 1014 and RFC 4506.
