I am new to optimizing code with SSE/SSE2 instructions and have not gotten very far yet. To my knowledge, a common SSE-optimized function would look like this:
Other answers suggest ANDing the pointer with a mask that has the low bits set, then comparing the result to zero.
But a more straightforward test is to take the pointer value modulo the desired alignment and compare the result to zero.
    #include <stdint.h>  /* uintptr_t */

    #define ALIGNMENT_VALUE 16u

    if (((uintptr_t)ptr % ALIGNMENT_VALUE) == 0) {
        // ptr is aligned
    }
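
To compare the two approaches side by side, here is a minimal sketch that wraps each test in a helper function. The names `is_aligned_mod` and `is_aligned_mask` are made up for illustration and do not come from any library. The sketch assumes that converting a pointer to `uintptr_t` yields its numeric address, which holds on common platforms but is implementation-defined in C.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: modulo form. Works for any non-zero alignment. */
static bool is_aligned_mod(const void *ptr, size_t alignment)
{
    assert(alignment != 0);
    return ((uintptr_t)ptr % alignment) == 0;
}

/* Hypothetical helper: mask form. Only valid for power-of-two alignments,
 * because (alignment - 1) must be a contiguous run of low bits. */
static bool is_aligned_mask(const void *ptr, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return ((uintptr_t)ptr & (alignment - 1)) == 0;
}
```

For a constant power-of-two alignment such as 16, optimizing compilers typically reduce the unsigned modulo to the same AND instruction, so the choice between the two forms is mostly about readability. The modulo form also keeps working if the alignment is not a power of two, whereas the mask form does not.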