VC++ SSE code generation - is this a compiler bug?

Submitted by 房东的猫 on 2020-01-14 09:11:56

Question


A very particular code sequence in VC++ generated the following instruction (for Win32):

unpcklpd    xmm0,xmmword ptr [ebp-40h]

2 questions arise:

(1) As far as I understand the Intel manual, unpcklpd requires its second operand, when it is a memory address, to be 16-byte (128-bit) aligned. If the address is relative to a stack frame, that alignment cannot be forced. Is this really a compiler bug?

(2) Exceptions are thrown at the execution of this instruction only when run from the debugger, and even then not always. Even attaching to the process and executing this code does not throw. How can this be?

The particular exception thrown is access violation at 0xFFFFFFFF, but AFAIK that's just a code for misalignment.


[Edit:] Here's some source that demonstrates the bad code generation - but typically doesn't cause a crash. (that's mostly what I'm wondering about)

[Edit 2:] The code sample now reproduces the actual crash. This one also crashes outside the debugger - I suspect the difference occurs because the debugger launches the program at different typical base addresses.

    // mock.cpp
    #include <stdio.h>
    struct mockVect2d
    {
        double x, y;
        mockVect2d()    {}
        mockVect2d(double a, double b) : x(a), y(b) {}
        mockVect2d operator + (const mockVect2d& u) {
            return mockVect2d(x + u.x, y + u.y);
        }
    };

    struct MockPoly
    {
        MockPoly() {}
        mockVect2d*    m_Vrts;
        double  m_Area;
        int     m_Convex;
        bool    m_ParClear;

        void ClearPar()  { m_Area = -1.; m_Convex = 0; m_ParClear = true; }

        MockPoly(int len) { m_Vrts = new mockVect2d[len]; }

        mockVect2d& Vrt(int i) {
            if (!m_ParClear) ClearPar();
            return m_Vrts[i];
        }

        const mockVect2d& GetCenter() { return m_Vrts[0]; }
    };


    struct MockItem
    {
        MockItem() : Contour(1) {}
        MockPoly Contour;
    };

    struct Mock
    {
        Mock() {}
        MockItem m_item;
        virtual int GetCount()                  { return 2; }
        virtual mockVect2d GetCenter()  { return mockVect2d(1.0, 2.0); }
        virtual MockItem GetItem(int i) { return m_item; }
    };

    void testInner(int a)
    {
        int c = 8;
        printf("%d", c);
        Mock* pMock = new Mock;
        int Flag = true;
        int nlr = pMock->GetCount();

        if (nlr == 0)
            return;

        int flr = 1;
        if (flr == nlr)
            return;

        if (Flag)
        {
            if (flr < nlr && flr>0) {
                int c = 8;
                printf("%d", c);

                MockPoly pol(2);
                mockVect2d ctr = pMock->GetItem(0).Contour.GetCenter();

                // The mess happens here:
                //          ; 74   :            pol.Vrt(1) = ctr + mockVect2d(0., 1.0);
                // 
                //          call ?Vrt@MockPoly@@QAEAAUmockVect2d@@H@Z ; MockPoly::Vrt
                //              movdqa  xmm0, XMMWORD PTR $T4[ebp]
                //              unpcklpd xmm0, QWORD PTR tv190[ebp]      **** crash!
                //              movdqu  XMMWORD PTR[eax], xmm0

                pol.Vrt(0) = ctr + mockVect2d(1.0, 0.);
                pol.Vrt(1) = ctr + mockVect2d(0., 1.0);
            }
        }
    }

    void main()
    {
        testInner(2);
        return;
    }

If you prefer, download a ready vcxproj with all the switches set from here. This includes the complete ASM too.


Answer 1:


Update: this is now a confirmed VC++ compiler bug, hopefully to be resolved in VS2015 RTM.


Edit: The connect report, like many others, is now garbage. However the compiler bug seems to be resolved in VS2017 - not in 2015 update 3.




Answer 2:


Since no one else has stepped up, I'm going to take a shot.

1) If the address is relative to a stack frame alignment cannot be forced. Is this really a compiler bug?

I'm not sure it is true that you cannot force alignment for stack variables. Consider this code:

struct foo
{
    char a;
    int b;
    unsigned long long c;
};

int wmain(int argc, wchar_t* argv[])
{
    foo moo;
    moo.a = 1;
    moo.b = 2;
    moo.c = 3;
}

Looking at the startup code for main, we see:

00E31AB0  push        ebp  
00E31AB1  mov         ebp,esp  
00E31AB3  sub         esp,0DCh  
00E31AB9  push        ebx  
00E31ABA  push        esi  
00E31ABB  push        edi  
00E31ABC  lea         edi,[ebp-0DCh]  
00E31AC2  mov         ecx,37h  
00E31AC7  mov         eax,0CCCCCCCCh  
00E31ACC  rep stos    dword ptr es:[edi]  
00E31ACE  mov         eax,dword ptr [___security_cookie (0E440CCh)]  
00E31AD3  xor         eax,ebp  
00E31AD5  mov         dword ptr [ebp-4],eax  

Adding __declspec(align(16)) to moo gives

01291AB0  push        ebx  
01291AB1  mov         ebx,esp  
01291AB3  sub         esp,8  
01291AB6  and         esp,0FFFFFFF0h  <------------------------
01291AB9  add         esp,4  
01291ABC  push        ebp  
01291ABD  mov         ebp,dword ptr [ebx+4]  
01291AC0  mov         dword ptr [esp+4],ebp  
01291AC4  mov         ebp,esp  
01291AC6  sub         esp,0E8h  
01291ACC  push        esi  
01291ACD  push        edi  
01291ACE  lea         edi,[ebp-0E8h]  
01291AD4  mov         ecx,3Ah  
01291AD9  mov         eax,0CCCCCCCCh  
01291ADE  rep stos    dword ptr es:[edi]  
01291AE0  mov         eax,dword ptr [___security_cookie (12A40CCh)]  
01291AE5  xor         eax,ebp  
01291AE7  mov         dword ptr [ebp-4],eax  

Apparently the compiler (VS2010 compiled debug for Win32), recognizing that we will need specific alignments for the code, takes steps to ensure it can provide that.

2) Exceptions are thrown at the execution of this instruction only when run from the debugger, and even then not always. Even attaching to the process and executing this code does not throw. How can this be?

So, a couple of thoughts:

  • "and even then not always" - Not standing over your shoulder when you run this, I can't say for certain. However, it seems plausible that just by random chance, stacks could get created with the alignment you need. By default, x86 uses 4-byte stack alignment. If you need 16-byte alignment, you've got a 1 in 4 chance.

  • As for the rest (from https://msdn.microsoft.com/en-us/library/aa290049%28v=vs.71%29.aspx#ia64alignment_topic4):

On the x86 architecture, the operating system does not make the alignment fault visible to the application. ...you will also suffer performance degradation on the alignment fault, but it will be significantly less severe than on the Itanium, because the hardware will make the multiple accesses of memory to retrieve the unaligned data.

TLDR: Using __declspec(align(16)) should give you the alignment you want, even for stack variables. For unaligned accesses, the OS will catch the exception and handle it for you (at a cost of performance).

Edit1: Responding to the first 2 comments below:

Based on MS's docs, you are correct about the alignment of stack parameters, but they propose a solution as well:

You cannot specify alignment for function parameters. When data that has an alignment attribute is passed by value on the stack, its alignment is controlled by the calling convention. If data alignment is important in the called function, copy the parameter into correctly aligned memory before use.

Neither your sample on Microsoft Connect nor the code above produces the same code for me (I'm only on VS2010), so I can't test this. But given this code from your sample:

struct mockVect2d
{
    double x, y;
    mockVect2d(double a, double b) : x(a), y(b) {}

It would seem that aligning either mockVect2d or the 2 doubles might help.



Source: https://stackoverflow.com/questions/28981458/vc-sse-code-generation-is-this-a-compiler-bug
