I did some googling and couldn't find any good articles on this question. What should I watch out for when implementing an app that I want to be endian-agnostic?
Several answers have covered file I/O, which is certainly the most common endian concern. I'll touch on one that has not been mentioned yet: unions.
The following union is a common tool in SIMD/SSE programming, and is not endian-friendly:
#include <emmintrin.h>  /* __m128i (SSE2) */
#include <stdint.h>

union uint128_t {
    __m128i  dq;
    uint64_t dd[2];
    uint32_t dw[4];
    uint16_t dh[8];
    uint8_t  db[16];
};
Any code accessing the dd/dw/dh/db forms will be doing so in an endian-specific fashion, because which index holds the least significant element depends on the byte order of the host. On 32-bit CPUs it is also somewhat common to see simpler unions that make it easier to break 64-bit arithmetic into 32-bit portions:
union u64_parts {
    uint64_t dd;
    uint32_t dw[2];
};
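To make the endian dependence concrete, here is a minimal sketch. The helper names (host_is_little_endian, low_half_via_union, low_half_via_shift, check_low_half) are made up for illustration. It shows that the array index of the low half has to be chosen per host, while a mask or shift works on the value and needs no such choice.

```c
#include <assert.h>
#include <stdint.h>

union u64_parts {
    uint64_t dd;
    uint32_t dw[2];
};

/* Runtime probe: returns 1 on a little-endian host, 0 on a big-endian one. */
static int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    return *(const uint8_t *)&probe == 1;
}

/* Endian-specific: dw[0] is the low half only on little-endian hosts,
   so the index has to be picked according to the host byte order. */
static uint32_t low_half_via_union(uint64_t v)
{
    union u64_parts p;
    p.dd = v;
    return p.dw[host_is_little_endian() ? 0 : 1];
}

/* Endian-agnostic: masks and shifts operate on values, not memory layout. */
static uint32_t low_half_via_shift(uint64_t v)
{
    return (uint32_t)(v & 0xFFFFFFFFu);
}

/* Usage check: both routes should agree on any host. */
static void check_low_half(void)
{
    const uint64_t v = 0x1122334455667788ull;
    assert(low_half_via_union(v) == low_half_via_shift(v));
}
```

Code that hard-codes dw[0] as "the low half" behaves like low_half_via_union with the index fixed at 0, which is only right on little-endian hosts.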
Since in this use case you rarely (if ever) want to iterate over each element of the union, I prefer to write such unions like this:
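union u64_parts {
    uint64_t dd;
    struct {
    /* BIG_ENDIAN is assumed to be a project-defined macro; some system
       headers (such as glibc's <endian.h>) may define it on every platform. */
#ifdef BIG_ENDIAN
        uint32_t dw2, dw1;
#else
        uint32_t dw1, dw2;
#endif
    };
};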
The result is implicit endian-swapping for any code that accesses dw1/dw2 directly: dw1 always names the low 32 bits and dw2 the high 32 bits, whatever the host byte order. The same design approach can be used for the 128-bit SIMD datatype above as well, though it ends up being considerably more verbose.
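As a rough usage sketch, here is a 64-bit add built from 32-bit halves on top of such a union. Several things here are assumptions and not part of the original answer: APP_BIG_ENDIAN is a hypothetical project macro derived from the `__BYTE_ORDER__` macro that GCC and Clang predefine (other compilers need their own test), add64_by_parts and check_add64 are made-up names, and the anonymous struct relies on C11.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: GCC/Clang predefine __BYTE_ORDER__ and __ORDER_BIG_ENDIAN__.
   APP_BIG_ENDIAN is a hypothetical project macro, not a standard one. */
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#define APP_BIG_ENDIAN 1
#endif

union u64_parts {
    uint64_t dd;
    struct {                /* anonymous struct (C11) */
#ifdef APP_BIG_ENDIAN
        uint32_t dw2, dw1;  /* high word first in memory */
#else
        uint32_t dw1, dw2;  /* low word first in memory */
#endif
    };
};

/* 64-bit add from 32-bit halves; dw1 is the low word on either byte order. */
static uint64_t add64_by_parts(uint64_t a, uint64_t b)
{
    union u64_parts x, y, r;
    x.dd = a;
    y.dd = b;
    r.dw1 = x.dw1 + y.dw1;            /* low halves, wraps modulo 2^32 */
    uint32_t carry = r.dw1 < x.dw1;   /* a wrapped sum means a carry out */
    r.dw2 = x.dw2 + y.dw2 + carry;    /* high halves plus the carry */
    return r.dd;
}

/* Usage check: the carry must propagate from the low into the high word. */
static void check_add64(void)
{
    assert(add64_by_parts(0x00000001FFFFFFFFull, 1) == 0x0000000200000000ull);
}
```

The arithmetic code never mentions byte order; only the union declaration does, which is the point of the approach.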
Disclaimer: Union use is often frowned upon because the language standards define structure padding and alignment only loosely. I find unions very useful and have used them extensively, and I haven't run into any cross-compatibility issues in a very long time (15+ years). In my experience, union padding/alignment behaves in an expected and consistent fashion for any current compiler targeting x86, ARM, or PowerPC.
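If you would rather verify the padding and alignment assumptions than trust them, a few compile-time checks can do that. This is only a sketch, assuming a C11 compiler (for static_assert and _Alignof); the named struct u32_pair and the member w are hypothetical stand-ins used so that offsetof has a named member to inspect.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

/* Hypothetical named variant of the two-halves struct. */
struct u32_pair {
    uint32_t dw1, dw2;
};

union u64_parts {
    uint64_t        dd;
    struct u32_pair w;
};

/* The build fails if any layout assumption does not hold on this target. */
static_assert(sizeof(struct u32_pair) == sizeof(uint64_t),
              "no padding inside the pair of halves");
static_assert(offsetof(struct u32_pair, dw2) == sizeof(uint32_t),
              "second half directly follows the first");
static_assert(sizeof(union u64_parts) == sizeof(uint64_t),
              "union adds no trailing padding");
static_assert(_Alignof(union u64_parts) == _Alignof(uint64_t),
              "union is aligned like a plain uint64_t");
```

These checks cost nothing at runtime, and a port to an unusual compiler or target fails at build time instead of misbehaving silently.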