Here is the code I normally use to get aligned memory with Visual Studio and GCC:
inline void* aligned_malloc(size_t size, size_t align) {
void *result;
If your compiler supports it, C++11 adds a std::align function to do runtime pointer alignment. You could implement your own malloc/free like this (untested):
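#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <memory>

template<std::size_t Align>
void *aligned_malloc(std::size_t size)
{
    // Room for the payload, worst-case padding, and the saved original pointer.
    std::size_t space = size + (Align - 1);
    void *ptr = std::malloc(space + sizeof(void*));
    if (!ptr) return nullptr;
    void *original_ptr = ptr;

    // Skip the slot reserved for the saved pointer, then align what follows.
    char *ptr_bytes = static_cast<char*>(ptr);
    ptr_bytes += sizeof(void*);
    ptr = static_cast<void*>(ptr_bytes);
    ptr = std::align(Align, size, ptr, space);

    // Store the pointer returned by malloc just before the aligned block.
    ptr_bytes = static_cast<char*>(ptr);
    ptr_bytes -= sizeof(void*);
    std::memcpy(ptr_bytes, &original_ptr, sizeof(void*));
    return ptr;
}

void aligned_free(void* ptr)
{
    if (!ptr) return;
    // Read back the pointer that malloc originally returned.
    char *ptr_bytes = static_cast<char*>(ptr);
    ptr_bytes -= sizeof(void*);
    void *original_ptr;
    std::memcpy(&original_ptr, ptr_bytes, sizeof(void*));
    std::free(original_ptr);
}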
Then you don't have to keep the original pointer value around to free it. Whether this is 100% portable I'm not sure, but I hope someone will correct me if not!
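To see what std::align does on its own, here is a minimal sketch that aligns a pointer inside a caller-supplied buffer. The names AlignResult and align_in_buffer are made up for illustration; only std::align comes from the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical result type: the aligned pointer and the bytes left from it.
struct AlignResult {
    void* ptr;          // nullptr if the buffer is too small
    std::size_t space;  // bytes available starting at ptr (unchanged on failure)
};

// Hypothetical helper: find the first `align`-aligned address in
// [buf, buf + buf_size) that still leaves room for `size` bytes.
inline AlignResult align_in_buffer(void* buf, std::size_t buf_size,
                                   std::size_t align, std::size_t size) {
    // std::align requires a power-of-two alignment.
    assert(align != 0 && (align & (align - 1)) == 0);
    void* p = buf;
    std::size_t space = buf_size;
    // std::align bumps p up to the next multiple of align and shrinks space
    // by the padding it skipped; on failure it returns nullptr and leaves
    // both p and space untouched.
    void* aligned = std::align(align, size, p, space);
    return AlignResult{aligned, space};
}
```

Starting one byte past a 16-byte boundary with 63 bytes available, a request for 32 bytes at 16-byte alignment skips 15 bytes of padding and leaves 48 bytes of space.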
As long as you're OK with having to call a special function to do the freeing, your approach is okay. I would do your #ifdefs the other way around, though: start with the standards-specified options and fall back to platform-specific ones. For example:
1. If __STDC_VERSION__ >= 201112L, use aligned_alloc.
2. If _POSIX_VERSION >= 200112L, use posix_memalign.
3. If _MSC_VER is defined, use the Windows stuff.
4. Otherwise, fall back to malloc/free and disable SSE/AVX code.
The problem is harder if you want to be able to pass the allocated pointer to free; that's valid on all the standard interfaces, but not on Windows and not necessarily with the legacy memalign function some unix-like systems have.
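The ordering above could be sketched roughly as follows. This is only an illustration: ALIGNED_BACKEND, portable_aligned_alloc and portable_aligned_free are hypothetical names, and I'm assuming the alignment is a power of two no smaller than sizeof(void*), which is what posix_memalign requires. Note that __STDC_VERSION__ is a C macro that C++ compilers usually don't define, so when this is built as C++ the first branch is normally skipped.

```cpp
#include <cassert>
#include <cstddef>
#include <stdlib.h>   // malloc, free, and aligned_alloc / posix_memalign where available

#if defined(_MSC_VER)
#include <malloc.h>   // _aligned_malloc, _aligned_free
#elif defined(__unix__) || defined(__APPLE__)
#include <unistd.h>   // defines _POSIX_VERSION on POSIX systems
#endif

// Pick one backend, standards first, platform-specific last.
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L
#define ALIGNED_BACKEND 1   // C11 aligned_alloc
#elif defined(_POSIX_VERSION) && _POSIX_VERSION >= 200112L
#define ALIGNED_BACKEND 2   // posix_memalign
#elif defined(_MSC_VER)
#define ALIGNED_BACKEND 3   // _aligned_malloc
#else
#define ALIGNED_BACKEND 0   // plain malloc: SSE/AVX code should be disabled
#endif

inline void* portable_aligned_alloc(std::size_t align, std::size_t size) {
    // Assumed precondition: power of two, at least sizeof(void*).
    assert(align >= sizeof(void*) && (align & (align - 1)) == 0);
#if ALIGNED_BACKEND == 1
    // C11 asks for size to be a multiple of the alignment, so round it up.
    return aligned_alloc(align, (size + align - 1) / align * align);
#elif ALIGNED_BACKEND == 2
    void* p = nullptr;
    return posix_memalign(&p, align, size) == 0 ? p : nullptr;
#elif ALIGNED_BACKEND == 3
    return _aligned_malloc(size, align);
#else
    return malloc(size);   // no alignment guarantee beyond malloc's own
#endif
}

inline void portable_aligned_free(void* p) {
#if ALIGNED_BACKEND == 3
    _aligned_free(p);      // Windows memory must not go to free()
#else
    free(p);               // valid for aligned_alloc, posix_memalign, malloc
#endif
}
```

Keeping the backend choice in a single macro means the allocation and the matching free can't drift apart, which matters most on Windows.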
Here are my 2 cents:
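temp = new unsigned char*[num];
AlignedBuffers = new unsigned char*[num];
for (int i = 0; i < num; i++)
{
    // Over-allocate by 15 bytes so a 16-byte boundary always fits in the block.
    temp[i] = new unsigned char[bufferSize + 15];
    // Round up to the next multiple of 16: 16-byte alignment in preparation for SSE.
    // std::uintptr_t comes from <cstdint>.
    AlignedBuffers[i] = reinterpret_cast<unsigned char*>(
        (reinterpret_cast<std::uintptr_t>(temp[i]) + 15) & ~static_cast<std::uintptr_t>(15));
}
// Release with delete[] temp[i], never through AlignedBuffers[i].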
The first function you propose would indeed work fine.
Your "homebrew" function also works, but has the drawback that if the pointer is already aligned, you have just wasted 15 bytes. That may not matter in some cases, but the OS may well be able to provide memory that is already correctly aligned without any waste (and if the memory needs to be aligned to 256 or 4096 bytes, you risk wasting a lot of memory by adding "alignment-1" bytes).
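To put numbers on that, here is a small sketch of the arithmetic. The helper names front_padding and over_allocation are mine, and it assumes the alignment is a power of two.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Bytes skipped at the front of a block starting at `addr` to reach the
// next multiple of `align`. Zero when addr is already aligned.
inline std::size_t front_padding(std::uintptr_t addr, std::size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    const std::uintptr_t mask = align - 1;
    return static_cast<std::size_t>((align - (addr & mask)) & mask);
}

// Extra bytes the "size + align - 1" trick always requests, no matter
// where the block actually lands.
inline std::size_t over_allocation(std::size_t align) {
    return align - 1;
}
```

For a block that already starts on a 16-byte boundary, front_padding is 0 while over_allocation(16) is still 15, so all 15 extra bytes are wasted. With 4096-byte alignment the extra request grows to 4095 bytes per allocation.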
Here is a fixed version of user2093113's sample; the code as originally posted didn't build for me (pointer arithmetic on void*, whose size is unknown). I also put it in a template class that overrides operator new/delete, so you don't have to do the allocation yourself and then call placement new.
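#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <memory>
#include <new>

template<std::size_t Alignment>
class Aligned
{
public:
    void* operator new(std::size_t size)
    {
        // Room for the object, worst-case padding, and the saved original pointer.
        std::size_t space = size + (Alignment - 1);
        void *ptr = std::malloc(space + sizeof(void*));
        if (!ptr) throw std::bad_alloc();
        void *original_ptr = ptr;

        // Skip the slot reserved for the saved pointer, then align what follows.
        char *ptr_bytes = static_cast<char*>(ptr);
        ptr_bytes += sizeof(void*);
        ptr = static_cast<void*>(ptr_bytes);
        ptr = std::align(Alignment, size, ptr, space);

        // Store malloc's pointer just before the aligned block.
        ptr_bytes = static_cast<char*>(ptr);
        ptr_bytes -= sizeof(void*);
        std::memcpy(ptr_bytes, &original_ptr, sizeof(void*));
        return ptr;
    }

    void operator delete(void* ptr)
    {
        if (!ptr) return;
        // Recover malloc's pointer from the slot just before the block.
        char *ptr_bytes = static_cast<char*>(ptr);
        ptr_bytes -= sizeof(void*);
        void *original_ptr;
        std::memcpy(&original_ptr, ptr_bytes, sizeof(void*));
        std::free(original_ptr);
    }
};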
Use it like this:
class Camera : public Aligned<16>
{
};
I haven't tested how portable this code is across platforms yet.