How to align pointer

前端 未结 6 2365
说谎
说谎 2020-12-01 04:29

How do I align a pointer to a 16 byte boundary?

I found this code, not sure if its correct

char* p= malloc(1024);

if ((((unsigned long) p) % 16) !=          


        
6条回答
  •  悲哀的现实
    2020-12-01 05:18

    Updated: New Faster Algorithm

    Don't use modulo because it takes hundreds of clock cycles on x86 due to the nasty division and a lot more on other systems. I came up with a faster version of std::align than GCC and Visual-C++. Visual-C++ has the slowest implementation, which actually uses an amateurish conditional statement. GCC is very similar to my algorithm but I did the opposite of what they did but my algorithm is 13.3 % faster because it has 13 as opposed to 15 single-cycle instructions. See here is the research paper with dissassembly. The algorithm is actually one instruction faster if you use the mask instead of the pow_2.

    /* Quickly aligns the given pointer to a power of two boundaries.
    @return An aligned pointer of typename T.
    @desc Algorithm is a 2's compliment trick that works by masking off
    the desired number in 2's compliment and adding them to the
    pointer. Please note how I took the horizontal comment whitespace back.
    @param pointer The pointer to align.
    @param mask Mask for the lower LSb, which is one less than the power of 
    2 you wish to align too. */
    template 
    inline T* AlignUp(void* pointer, uintptr_t mask) {
      intptr_t value = reinterpret_cast(pointer);
      value += (-value) & mask;
      return reinterpret_cast(value);
    }
    

    Here is how you call it:

    enum { kSize = 256 };
    char buffer[kSize + 16];
    char* aligned_to_16_byte_boundary = AlignUp<> (buffer, 15); //< 16 - 1 = 15
    char16_t* aligned_to_64_byte_boundary = AlignUp (buffer, 63);
    

    Here is the quick bit-wise proof for 3 bits, it works the same for all bit counts:

    ~000 = 111 => 000 + 111 + 1 = 0x1000
    ~001 = 110 => 001 + 110 + 1 = 0x1000
    ~010 = 101 => 010 + 101 + 1 = 0x1000
    ~011 = 100 => 011 + 100 + 1 = 0x1000
    ~100 = 011 => 100 + 011 + 1 = 0x1000
    ~101 = 010 => 101 + 010 + 1 = 0x1000
    ~110 = 001 => 110 + 001 + 1 = 0x1000
    ~111 = 000 => 111 + 000 + 1 = 0x1000
    

    Just in case you're here to learn how to align to a cache line an object in C++11, use the in-place constructor:

    struct Foo { Foo () {} };
    Foo* foo = new (AlignUp (buffer, 63)) Foo ();
    

    Here is the std::align implmentation, it uses 24 instructions where the GCC implementation uses 31 instructions, though it can be tweaked to eliminate a decrement instruction by turning (--align) to the mask for the Least Significant bits but that would not operate functionally identical to std::align.

    inline void* align(size_t align, size_t size, void*& ptr,
                       size_t& space) noexcept {
       intptr_t int_ptr = reinterpret_cast(ptr),
               offset = (-int_ptr) & (--align);
      if ((space -= offset) < size) {
        space += offset;
        return nullptr;
      }
      return reinterpret_cast(int_ptr + offset);
    }
    

    Faster to Use mask rather than pow_2

    Here is the code for aligning using a mask rather than the the pow_2 (which is the even power of 2). This is 20% fatert than the GCC algorithm but requires you to store the mask rather than the pow_2 so it's not interchangable.

    inline void* AlignMask(size_t mask, size_t size, void*& ptr,
                       size_t& space) noexcept {
       intptr_t int_ptr = reinterpret_cast(ptr),
               offset = (-int_ptr) & mask;
      if ((space -= offset) < size) {
        space += offset;
        return nullptr;
      }
      return reinterpret_cast(int_ptr + offset);
    }
    

提交回复
热议问题