best cross-platform method to get aligned memory

Submitted by 浪尽此生 on 2019-12-17 15:37:12

Question


Here is the code I normally use to get aligned memory with Visual Studio and GCC:

#include <stdlib.h>   // posix_memalign, free
#ifdef _MSC_VER
#include <malloc.h>   // _aligned_malloc, _aligned_free
#endif

inline void* aligned_malloc(size_t size, size_t align) {
    void *result;
    #ifdef _MSC_VER
    result = _aligned_malloc(size, align);
    #else
    if (posix_memalign(&result, align, size)) result = 0;
    #endif
    return result;
}

inline void aligned_free(void *ptr) {
    #ifdef _MSC_VER
    _aligned_free(ptr);
    #else
    free(ptr);
    #endif
}

Is this code fine in general? I have also seen people use _mm_malloc and _mm_free. In most cases where I want aligned memory, it is for SSE/AVX. Can I use those functions in general? It would make my code a lot simpler.

Lastly, it's easy to create my own function to align memory (see below). Why then are there so many different common functions to get aligned memory (many of which only work on one platform)?

This code does 16 byte alignment.

float* array = (float*)malloc(SIZE*sizeof(float)+15);

// find the aligned position
// and use this pointer to read or write data into array
// (uintptr_t from <stdint.h>; unsigned long is only 32 bits on 64-bit Windows)
float* alignedArray = (float*)(((uintptr_t)array + 15) & ~(uintptr_t)0x0F);

// deallocate the original "array", NOT alignedArray
free(array);
array = alignedArray = 0;

See: http://www.songho.ca/misc/alignment/dataalign.html and the question "How to allocate aligned memory only using the standard library?"

Edit: In case anyone cares, I got the idea for my aligned_malloc() function from Eigen (Eigen/src/Core/util/Memory.h)

Edit: I just discovered that posix_memalign is undefined for MinGW. However, _mm_malloc works for Visual Studio 2012, GCC, MinGW, and the Intel C++ compiler so it seems to be the most convenient solution in general. It also requires using its own _mm_free function, although on some implementations you can pass pointers from _mm_malloc to the standard free / delete.
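
For illustration, here is a minimal sketch of the _mm_malloc / _mm_free pairing described in the edit above. It assumes an x86 target where <xmmintrin.h> provides both functions. The helper name make_filled is hypothetical and not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <xmmintrin.h>  // _mm_malloc, _mm_free, SSE intrinsics (x86 only)

// Hypothetical helper: allocates n floats on a 16-byte boundary and fills
// them with `value` using aligned SSE stores. n must be a multiple of 4.
inline float* make_filled(std::size_t n, float value) {
    assert(n % 4 == 0);
    float* data = static_cast<float*>(_mm_malloc(n * sizeof(float), 16));
    if (!data) return nullptr;
    const __m128 v = _mm_set1_ps(value);
    for (std::size_t i = 0; i < n; i += 4) {
        _mm_store_ps(data + i, v);  // aligned store: needs 16-byte alignment
    }
    return data;
}
```

The returned buffer should be released with _mm_free(data). Passing it to free or delete only happens to work on some implementations.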


Answer 1:


The first function you propose would indeed work fine.

Your "homebrew" function also works, but it has a drawback: if the block returned by malloc is already aligned, you have wasted 15 bytes. That may not matter in some cases, but the OS may well be able to provide correctly aligned memory without any waste. The cost also grows with the alignment: if you need 256-byte or 4096-byte alignment, adding "alignment-1" bytes to every allocation can waste a lot of memory.
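
The arithmetic behind that waste can be sketched as follows. The names align_up and front_padding are hypothetical helpers introduced only for this illustration, and the sketch assumes the alignment is a power of two.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Round addr up to the next multiple of align (align must be a power of two).
inline std::uintptr_t align_up(std::uintptr_t addr, std::size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + (align - 1)) & ~static_cast<std::uintptr_t>(align - 1);
}

// Bytes skipped at the front of a block starting at addr to reach
// the first aligned address.
inline std::size_t front_padding(std::uintptr_t addr, std::size_t align) {
    return static_cast<std::size_t>(align_up(addr, align) - addr);
}
```

When the address is already a multiple of the alignment, front_padding is 0, yet the homebrew scheme has still requested align - 1 extra bytes. In the worst case almost the whole reserve is skipped, which for a 4096-byte alignment is nearly a full page per allocation.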




Answer 2:


As long as you're ok with having to call a special function to do the freeing, your approach is okay. I would do your #ifdefs the other way around though: start with the standards-specified options and fall back to platform-specific ones. For example

  1. If __STDC_VERSION__ >= 201112L use aligned_alloc.
  2. If _POSIX_VERSION >= 200112L use posix_memalign.
  3. If _MSC_VER is defined, use _aligned_malloc and _aligned_free.
  4. ...
  5. If all else fails, just use malloc/free and disable SSE/AVX code.

The problem is harder if you want to be able to pass the allocated pointer to free; that's valid on all the standard interfaces, but not on Windows and not necessarily with the legacy memalign function some unix-like systems have.
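
A rough sketch of that standards-first ordering, written as C++ rather than C, might look like the following. The names portable_aligned_alloc and portable_aligned_free are hypothetical. The sketch makes two assumptions beyond the answer: it tests __cplusplus >= 201703L (for std::aligned_alloc) in place of __STDC_VERSION__, and it tests _WIN32 rather than _MSC_VER so that MinGW takes the Windows branch too.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#if defined(_WIN32)
#  include <malloc.h>   // _aligned_malloc, _aligned_free
#elif defined(__unix__) || defined(__APPLE__)
#  include <unistd.h>   // defines _POSIX_VERSION
#endif

// align must be a power of two.
inline void* portable_aligned_alloc(std::size_t size, std::size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
#if __cplusplus >= 201703L && !defined(_WIN32)
    // std::aligned_alloc expects size to be a multiple of align.
    std::size_t rounded = (size + align - 1) / align * align;
    return std::aligned_alloc(align, rounded);
#elif defined(_POSIX_VERSION) && _POSIX_VERSION >= 200112L
    // posix_memalign expects align to be a multiple of sizeof(void*).
    if (align < sizeof(void*)) align = sizeof(void*);
    void* p = nullptr;
    return posix_memalign(&p, align, size) == 0 ? p : nullptr;
#elif defined(_WIN32)
    return _aligned_malloc(size, align);
#else
    // Last resort: no alignment guarantee, so SSE/AVX paths must be disabled.
    (void)align;
    return std::malloc(size);
#endif
}

inline void portable_aligned_free(void* p) {
#if defined(_WIN32)
    _aligned_free(p);   // must not be passed to free()
#else
    std::free(p);       // valid for aligned_alloc, posix_memalign and malloc
#endif
}
```

Keeping a dedicated free function is what makes the Windows branch possible. If callers must be able to pass the pointer to plain free, only the standard and POSIX branches qualify.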




Answer 3:


Here is a fixed version of user2093113's sample; the original code didn't build for me (arithmetic on void*, whose size is unknown). I also put it in a class template that overrides operator new/delete, so you don't have to do the allocation yourself and then call placement new.

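#include <cstddef>  // std::size_t
#include <cstdlib>  // std::malloc, std::free
#include <cstring>  // std::memcpy
#include <memory>   // std::align
#include <new>      // std::bad_alloc

template<std::size_t Alignment>
class Aligned
{
public:
    void* operator new(std::size_t size)
    {
        std::size_t space = size + (Alignment - 1);
        void *ptr = std::malloc(space + sizeof(void*));
        if (!ptr) throw std::bad_alloc();
        void *original_ptr = ptr;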

        char *ptr_bytes = static_cast<char*>(ptr);
        ptr_bytes += sizeof(void*);
        ptr = static_cast<void*>(ptr_bytes);

        ptr = std::align(Alignment, size, ptr, space);

        ptr_bytes = static_cast<char*>(ptr);
        ptr_bytes -= sizeof(void*);
        std::memcpy(ptr_bytes, &original_ptr, sizeof(void*));

        return ptr;
    }

    void operator delete(void* ptr)
    {
        char *ptr_bytes = static_cast<char*>(ptr);
        ptr_bytes -= sizeof(void*);

        void *original_ptr;
        std::memcpy(&original_ptr, ptr_bytes, sizeof(void*));

        std::free(original_ptr);
    }
};

Use it like this:

class Camera : public Aligned<16>
{
};

I haven't tested how portable this code is across platforms yet.




Answer 4:


If your compiler supports it, C++11 adds a std::align function to do runtime pointer alignment. You could implement your own malloc/free like this (untested):

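#include <cstddef>  // std::size_t
#include <cstdlib>  // std::malloc, std::free
#include <cstring>  // std::memcpy
#include <memory>   // std::align

template<std::size_t Align>
void *aligned_malloc(std::size_t size)
{
    std::size_t space = size + (Align - 1);
    void *ptr = std::malloc(space + sizeof(void*));
    if (!ptr) return nullptr;
    void *original_ptr = ptr;

    // Reserve room in front of the aligned block for the original pointer.
    char *ptr_bytes = static_cast<char*>(ptr);
    ptr_bytes += sizeof(void*);
    ptr = static_cast<void*>(ptr_bytes);

    ptr = std::align(Align, size, ptr, space);

    // Store the pointer returned by malloc just before the aligned block.
    ptr_bytes = static_cast<char*>(ptr);
    ptr_bytes -= sizeof(void*);
    std::memcpy(ptr_bytes, &original_ptr, sizeof(void*));

    return ptr;
}

void aligned_free(void* ptr)
{
    // Byte arithmetic needs char*, not void*.
    char *ptr_bytes = static_cast<char*>(ptr);
    ptr_bytes -= sizeof(void*);

    void *original_ptr;
    std::memcpy(&original_ptr, ptr_bytes, sizeof(void*));

    std::free(original_ptr);
}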

Then you don't have to keep the original pointer value around to free it. Whether this is 100% portable I'm not sure, but I hope someone will correct me if not!
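
To see what std::align does on its own, here is a small sketch separate from the allocator above. The helper first_aligned is a hypothetical name used only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>  // std::align

// Hypothetical helper: returns the first Align-aligned address inside
// [buffer, buffer + capacity) that still has room for `size` bytes,
// or nullptr if no such address exists.
template <std::size_t Align>
void* first_aligned(void* buffer, std::size_t capacity, std::size_t size) {
    void* p = buffer;
    std::size_t space = capacity;
    // On success std::align moves p forward and shrinks space by the
    // padding it skipped; on failure it returns nullptr and changes neither.
    return std::align(Align, size, p, space);
}
```

This is why the allocator requests size + (Align - 1) bytes: with that much space, std::align can always find an aligned position, whatever address malloc returns.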




Answer 5:


Here are my 2 cents:

temp = new unsigned char*[num];
AlignedBuffers = new unsigned char*[num];
for (int i = 0; i<num; i++)
{
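    temp[i] = new unsigned char[bufferSize + 15];
    // round up to a 16-byte boundary in preparation for SSE;
    // release temp[i] later, not AlignedBuffers[i] (std::uintptr_t is in <cstdint>)
    AlignedBuffers[i] = reinterpret_cast<unsigned char*>(
        (reinterpret_cast<std::uintptr_t>(temp[i]) + 15) & ~static_cast<std::uintptr_t>(15));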
}


Source: https://stackoverflow.com/questions/16376942/best-cross-platform-method-to-get-aligned-memory
