Often I find myself having to represent a structure that consists of very small values. For example, Foo
has 4 values, a, b, c, d
that, range from
If what you're after is efficiency of space, then you should consider avoiding struct
s altogether. The compiler will insert padding into your struct representation as necessary to make its size a multiple of its alignment requirement, which might be as much as 16 bytes (but is more likely to be 4 or 8 bytes, and could after all be as little as 1 byte).
If you use a struct anyway, then which to use depends on your implementation. If @dbush's bitfield approach yields one-byte structures then it's hard to beat that. If your implementation is going to pad the representation to at least four bytes no matter what, however, then this is probably the one to use:
struct Foo {
char a;
char b;
char c;
char d;
};
Or I guess I would probably use this variant:
struct Foo {
uint8_t a;
uint8_t b;
uint8_t c;
uint8_t d;
};
Since we're supposing that your struct is taking up a minimum of four bytes, there is no point in packing the data into smaller space. That would be counter-productive, in fact, because it would also make the processor do the extra work packing and unpacking the values within.
For handling large amounts of data, making efficient use of the CPU cache provides a far greater win than avoiding a few integer operations. If your data usage pattern is at least somewhat systematic (e.g. if after accessing one element of your erstwhile struct array, you are likely to access a nearby one next) then you are likely to get a boost in both space efficiency and speed by packing the data as tightly as you can. Depending on your C implementation (or if you want to avoid implementation dependency), you might need to achieve that differently -- for instance, via an array of integers. For your particular example of four fields, each requiring two bits, I would consider representing each "struct" as a uint8_t
instead, for a total of 1 byte each.
Maybe something like this:
#include
#define NUMBER_OF_FOOS 1000000000
#define A 0
#define B 2
#define C 4
#define D 6
#define SET_FOO_FIELD(foos, index, field, value) \
((foos)[index] = (((foos)[index] & ~(3 << (field))) | (((value) & 3) << (field))))
#define GET_FOO_FIELD(foos, index, field) (((foos)[index] >> (field)) & 3)
typedef uint8_t foo;
foo all_the_foos[NUMBER_OF_FOOS];
The field name macros and access macros provide a more legible -- and adjustable -- way to access the individual fields than would direct manipulation of the array (but be aware that these particular macros evaluate some of their arguments more than once). Every bit is used, giving you about as good cache usage as it is possible to achieve through choice of data structure alone.