Here is the code I normally use to get aligned memory with Visual Studio and GCC:
inline void* aligned_malloc(size_t size, size_t align) {
void *result;
The first function you propose would indeed work fine.
Your "homebrew" function also works, but it has a drawback: if the pointer that comes back is already aligned, the extra 15 bytes you requested (alignment-1, for 16-byte alignment) are simply wasted. That may not matter in some cases, but the OS may well be able to provide correctly aligned memory without any waste. And if the memory needs to be aligned to 256 or 4096 bytes, adding "alignment-1" bytes to every allocation risks wasting a lot of memory.
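To put rough numbers on that, here is a minimal sketch in C. The names `homebrew_request_size`, `platform_aligned_alloc` and `platform_aligned_free` are hypothetical helpers made up for illustration, not your code. The first one only computes how many bytes an over-allocating approach has to request; the other two assume the usual platform calls (`_aligned_malloc`/`_aligned_free` on Visual Studio, `posix_memalign`/`free` elsewhere).

```c
#ifndef _MSC_VER
#define _POSIX_C_SOURCE 200112L  /* expose posix_memalign under strict -std modes */
#endif

#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#ifdef _MSC_VER
#include <malloc.h>  /* _aligned_malloc, _aligned_free */
#endif

/* Hypothetical helper: bytes an over-allocating ("homebrew") scheme must
   request so that an aligned block of `size` bytes always fits inside. */
static size_t homebrew_request_size(size_t size, size_t align) {
    return size + align - 1;
}

/* Hypothetical wrapper: ask the platform for aligned memory directly.
   Assumes `align` is a power of two. Returns NULL on failure. */
static void *platform_aligned_alloc(size_t size, size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
#ifdef _MSC_VER
    return _aligned_malloc(size, align);
#else
    void *p = NULL;
    /* posix_memalign also requires align to be a multiple of sizeof(void *). */
    if (align < sizeof(void *)) {
        align = sizeof(void *);
    }
    if (posix_memalign(&p, align, size) != 0) {
        return NULL;
    }
    return p;
#endif
}

/* Memory from _aligned_malloc must go to _aligned_free, not free(). */
static void platform_aligned_free(void *p) {
#ifdef _MSC_VER
    _aligned_free(p);
#else
    free(p);
#endif
}
```

With these, `homebrew_request_size(100, 16)` is 115 bytes, while `homebrew_request_size(100, 4096)` is 4195 bytes, so the padding dwarfs the 100-byte payload at page alignment. The platform allocator may have bookkeeping overhead of its own; the point is only that it is free to pick a block that already sits on the right boundary.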