GCC - How to realign stack?

Submitted by 无人久伴 on 2019-12-03 08:15:49

Allocate on the stack an array that is 15 bytes larger than sizeof(__m128), and use the first 16-byte-aligned address in that array. If you need several objects, allocate them together in one array with a single 15-byte margin for alignment.

I do not remember whether allocating an unsigned char array makes you safe from the compiler's strict-aliasing optimizations, or whether it only works the other way round.

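#include <stdint.h>
#include <xmmintrin.h>

void *f(void *x)
{
   /* 15 spare bytes guarantee that a 16-byte-aligned address lies inside y */
   unsigned char y[sizeof(__m128)+15];
   /* round the start of y up to the next multiple of 16 */
   __m128 *py = (__m128*) (((uintptr_t)y + 15) & ~(uintptr_t)15);
   ...
}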
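
The rounding step can be checked on its own. Below is a minimal sketch, assuming a power-of-two alignment; the helper names align_up and first_aligned16 are made up for illustration and do not come from any library.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr up to the next multiple of align.
   align must be a power of two (hypothetical helper, for illustration). */
uintptr_t align_up(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + (align - 1)) & ~(align - 1);
}

/* Return the first 16-byte-aligned address inside buf.
   buf must be at least 15 bytes larger than the object placed there. */
void *first_aligned16(unsigned char *buf)
{
    return (void *)align_up((uintptr_t)buf, 16);
}
```

An address that is already aligned is left unchanged, and any other address moves forward by at most 15 bytes, which is why a 15-byte margin is enough.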

This shouldn't be happening in the first place, but to work around the problem you can try:

void *f(void *x)
{
   __m128 y __attribute__ ((aligned (16)));
   ...
}
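
To see whether the attribute actually took effect at run time, you can print or test the address of the local. This is only a sketch: the function names are hypothetical, a float[4] stands in for __m128 so that it compiles without xmmintrin.h, and the address goes through a volatile so that the compiler cannot fold the check into a constant.

```c
#include <stdint.h>

/* Returns 1 if p is a multiple of 16, 0 otherwise. */
int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

/* Declares an over-aligned local and reports whether it really is aligned.
   (Hypothetical diagnostic helper, not part of the original code.) */
int local_is_aligned(void)
{
    float y[4] __attribute__ ((aligned (16)));
    /* volatile: force the real run-time address to be stored and re-read */
    volatile uintptr_t addr = (uintptr_t)y;
    y[0] = 0.0f;
    return (addr & 15u) == 0;
}
```

If local_is_aligned() returns 0 inside the thread function, the compiler trusted an incoming stack alignment that the caller did not provide.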

Another solution would be to use a padding function, which first aligns the stack and then calls f. So instead of calling f directly, you call pad, which pads the stack first and then calls f with an aligned stack.

The code would look like this:

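#include <stdint.h>
#include <xmmintrin.h>
#include <pthread.h>

#define ALIGNMENT 16

void *f(void *x) {
    __m128 y;
    // other stuff
    return NULL;
}

void *pad(void *val) {
    unsigned int x; // only used to get the current address on the stack
    // uintptr_t keeps the full address; a cast to unsigned int would truncate 64-bit pointers.
    // Whether this padding really aligns f's frame depends on how the compiler lays out the stack.
    unsigned char padding[ALIGNMENT - ((uintptr_t) &x) % ALIGNMENT];
    (void) padding;
    return f(val);
}

int main(void){
    pthread_t p;
    pthread_create(&p, NULL, pad, NULL);
    pthread_join(p, NULL); // wait for the thread before main returns
    return 0;
}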

Sorry to resurrect an old thread...

For those with a newer compiler than the OP's: the OP mentions a -mstackrealign option, which led me to __attribute__((force_align_arg_pointer)). If your function is being optimized to use SSE but the stack is misaligned on entry, this attribute makes the compiler realign the stack at run time where required, transparently. I also found out that this is only an issue on i386; the x86_64 ABI guarantees that the stack is aligned to 16 bytes at call sites.

__attribute__((force_align_arg_pointer)) void i_crash_when_not_aligned_to_16_bytes() { ... }
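
In a threaded program the attribute goes on the thread entry point, since that is the function called with a possibly misaligned stack. Here is a rough sketch under a few assumptions: REALIGN_STACK and thread_entry are hypothetical names, the macro applies the attribute only on 32-bit x86 builds with GCC-compatible compilers, and arg is assumed to point to four floats.

```c
#include <stddef.h>

/* Hypothetical macro: use the attribute only where it matters (i386). */
#if defined(__GNUC__) && defined(__i386__)
#define REALIGN_STACK __attribute__((force_align_arg_pointer))
#else
#define REALIGN_STACK
#endif

/* Has the signature expected by pthread_create for a start routine.
   Doubles the four input floats and stores their sum in the first one. */
REALIGN_STACK void *thread_entry(void *arg)
{
    float acc[4] __attribute__ ((aligned (16)));
    float *in = arg;
    size_t i;

    for (i = 0; i < 4; i++)
        acc[i] = in[i] * 2.0f;
    in[0] = acc[0] + acc[1] + acc[2] + acc[3];
    return arg;
}
```

You would then pass thread_entry to pthread_create as usual; nothing changes for the caller.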

Cool article for those who might want to learn more: http://wiki.osdev.org/System_V_ABI

I have solved this problem. Here is my solution:

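#include <stddef.h>
#include <xmmintrin.h>

void another_function(void){
   __m128 y;
   ...
}
void *f(void *x){
   /* i386 only. One asm statement, so the compiler cannot emit code between the steps. */
   asm volatile("movl %%esp, %%esi\n\t"     /* save the stack pointer (%esi is callee-saved) */
                "andl $-0x10, %%esp\n\t"    /* round %esp down to a multiple of 16 */
                "call another_function\n\t"
                "movl %%esi, %%esp"         /* restore the stack pointer */
                : : : "esi", "eax", "ecx", "edx", "cc", "memory");
   return NULL;
}

First, we save the stack pointer in %esi, a callee-saved register that another_function has to preserve. Second, we clear the least-significant nibble of %esp, which rounds the stack pointer down to a 16-byte boundary. We then call another function, which has all its own local variables 16-byte aligned. All nested functions will also have their local variables 16-byte aligned. Finally, we restore the stack pointer from %esi. The saved value has to live in a register: once %esp has been rounded down, a popl %esp would no longer read the slot that an earlier pushl %esp wrote.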

And it works!
