GCC generated assembly for unaligned float access on ARM

Posted by 陌路散爱 on 2021-02-20 17:56:41

Question


Hello, I am currently working on a program that needs to process a data blob containing a series of floats that may be unaligned (and sometimes are). I am compiling with gcc 4.6.2 for an ARM Cortex-A8, and I have a question about the generated assembly code.

I wrote a minimal example. For the following test code

float aligned[2];
float *unaligned = (float*)(((char*)aligned)+2);

int main(int argc, char **argv) 
{
    float f = unaligned[0];  
    return (int)f;
}

the compiler (gcc 4.6.2 - with optimization -O3) produces

00008634 <main>:
    8634: e30038ec            movw         r3, #2284      ; 0x8ec
    8638: e3403001            movt         r3, #1
    863c: e5933000            ldr          r3, [r3]
    8640: edd37a00            vldr         s15, [r3]
    8644: eefd7ae7            vcvt.s32.f32 s15, s15
    8648: ee170a90            vmov         r0, s15
    864c: e12fff1e            bx           lr

The compiler cannot know here whether the data is aligned, but it nevertheless uses VLDR, which requires aligned data; with unaligned data the program crashes with a bus error.

Now here is my actual question: is this correct behavior from the compiler, meaning I need to take care of alignment in my C++ code, or is this a bug in the compiler?

I should also mention my current workaround, which works and makes gcc copy the value before accessing it. The trick is to define a struct that contains only a float, mark it with the gcc packed attribute, and access the data through a pointer to that struct. Code snippet:

struct FloatWrapper { float f; } __attribute__((packed));
const FloatWrapper *x = reinterpret_cast<const FloatWrapper *>(rawX.data());
const FloatWrapper *y = reinterpret_cast<const FloatWrapper *>(rawY.data());

for (size_t i = 0; i < vertexCount; ++i) {
    vertices[i].x = x[i].f;
    vertices[i].y = y[i].f;
}
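
A portable alternative to the packed struct is to copy the bytes into a local float with std::memcpy, which places no alignment requirement on its source. A minimal sketch; load_float is a hypothetical helper name, not something from the original code:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: read the index-th float from a buffer that may
// start at any byte address. memcpy copies byte-wise as far as the
// language is concerned, so no aligned float access is implied.
inline float load_float(const void* base, std::size_t index) {
    assert(base != nullptr);
    const unsigned char* bytes = static_cast<const unsigned char*>(base);
    float value;
    std::memcpy(&value, bytes + index * sizeof(float), sizeof(float));
    return value;
}
```

With this, the loop body from the snippet above could read vertices[i].x = load_float(rawX.data(), i);. Compilers typically expand a small fixed-size memcpy inline rather than emitting a library call, though that depends on the compiler version and flags.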

Answer 1:


As you have pointed out, the ARM ARM (ARM Architecture Reference Manual), section A3.2.1, states that VLDR generates an Alignment fault regardless of the SCTLR.A value.
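
Since the fault does not depend on SCTLR.A, one way to catch a misaligned pointer before the load is a run-time alignment check. A minimal sketch, assuming a mainstream platform where converting a pointer to std::uintptr_t preserves the low address bits; is_aligned_for and load_aligned are hypothetical names, not part of any library:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true if p satisfies the alignment requirement of T.
// Assumes the pointer-to-integer conversion keeps the address bits intact,
// which is implementation-defined but holds on common ARM toolchains.
template <typename T>
bool is_aligned_for(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % alignof(T) == 0;
}

// Hypothetical helper: a direct float load that states its precondition.
// In debug builds the assert fires instead of the hardware fault.
inline float load_aligned(const void* p) {
    assert(is_aligned_for<float>(p));
    return *static_cast<const float*>(p);
}
```

For the test program in the question, is_aligned_for<float>(unaligned) would be expected to return false, because the pointer is offset by 2 bytes from a float array.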

I tested your example on a Cortex-A9 and got

# float_align                                                   
[1] + Stopped (signal)     float_align 

However, I am also confused by the ARM Cortex-A8 TRM, section 4.2.1, which states

If an alignment qualifier is not specified, and A=1, the alignment fault is taken if it is not aligned to element size.

If an alignment qualifier is not specified, and A=0, it is treated as unaligned access.

This is probably a half-baked explanation, since the ARM ARM gives more information, including a detailed table of instructions.

So I think the answer is that you need to take care of alignment yourself, since the compiler cannot determine in all scenarios which addresses you are loading from; for example, an address might only be known after linking.
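
Taking care of alignment yourself can also mean copying the floats out of the blob once, into storage that is guaranteed to be aligned, and working on that copy. A minimal sketch under that assumption; copy_floats is a hypothetical helper name:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: copy `count` floats from a possibly unaligned blob
// into a std::vector<float>, whose storage is suitably aligned for float.
inline std::vector<float> copy_floats(const void* blob, std::size_t count) {
    assert(blob != nullptr || count == 0);
    std::vector<float> out(count);
    if (count != 0) {
        // Byte-wise copy: the source address needs no particular alignment.
        std::memcpy(out.data(), blob, count * sizeof(float));
    }
    return out;
}
```

This costs one extra copy of the data, but every later access goes through an ordinary, aligned float pointer.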



Source: https://stackoverflow.com/questions/17184731/gcc-generated-assembly-for-unaligned-float-access-on-arm
