How Can a Stack Trace Point to the Wrong Line (the “return” Statement) - 40 Lines Off

面向向阳花 2020-12-08 15:27

I have twice now seen a NullReferenceException logged from a Production ASP.NET MVC 4 web application - and logged on the wrong line. Not wrong by a line or two

4 Answers
  •  谎友^ (original poster)
    2020-12-08 16:09

    I have seen this kind of behavior in production code once, although the details are a little vague: it was approximately 2 years ago, and although I can find the email, I no longer have access to the code or the dumps.

    FYI, this is what I wrote to the team (small excerpts from a long email):

    // Code at TeamProvider.cs:line 34
    Team securedTeam = TeamProvider.GetTeamByPath(teamPath); // Static method call.
    

    "No way null reference exception can happen here."

    Later, after more dump diving:

    "Findings -

    1. The issue was happening in DBI because it did not have root/BRH team. UI is not handling the null returned by CLib gracefully, and hence the exception.
    2. The stack trace shown on UI was misleading, and was due to the fact that Jitter and CPU can optimize / reorder instructions, causing stack traces to "lie".

    Digging into a process dump revealed the issue, and has been confirmed that DBI indeed did not have above mentioned team."


    I think the thing to note here is finding 2 above (the Jitter and CPU can optimize / reorder instructions, causing stack traces to "lie"), in contrast with your analysis and statements:

    "I just looked at the decompiled source, and it does not appear to have been rearranged.", or

    "Production build running on my local machine shows the correct line number."

    The idea is that optimizations can happen at different levels, and those done at compile time are just some of them. Today, especially in a managed environment like .NET, relatively few optimizations are done while emitting IL. (Why should 10 compilers for 10 different .NET languages each implement the same set of optimizations, when the emitted Intermediate Language code will be further transformed into machine code anyway, either by ngen or by the Jitter?)

    Hence, what you have observed can only be confirmed by looking at the jitted machine code (i.e. the assembly) in a dump from the production machine.
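
    If taking a full dump is not practical, a cheaper first step is to log the raw offsets of each stack frame next to the reported line number, so they can later be matched against the jitted code. The following is only a minimal sketch: the names FrameDump and Describe are made up for illustration, and it assumes the PDB is deployed (otherwise the line number comes back as 0). In optimized code the IL offset itself may be approximate, so treat the native offset as the more trustworthy value.

```csharp
using System;
using System.Diagnostics;
using System.Text;

// Hypothetical helper (not from the original post): dumps offsets per frame.
public static class FrameDump
{
    // Returns "unknown" when the runtime could not determine the offset.
    static string FormatOffset(int offset)
    {
        return offset == StackFrame.OFFSET_UNKNOWN
            ? "unknown"
            : "0x" + offset.ToString("X");
    }

    // Lists every frame of the exception's stack trace with its IL offset,
    // its native (jitted code) offset, and the line resolved from the PDB.
    public static string Describe(Exception ex)
    {
        var trace = new StackTrace(ex, true); // true = try to load file/line info
        var sb = new StringBuilder();
        sb.AppendLine(ex.GetType().FullName + ": " + ex.Message);

        for (int i = 0; i < trace.FrameCount; i++)
        {
            StackFrame frame = trace.GetFrame(i);
            var method = frame.GetMethod();
            string name = method == null
                ? "<unknown method>"
                : method.DeclaringType + "." + method.Name;

            sb.AppendLine("  " + name
                + " IL=" + FormatOffset(frame.GetILOffset())
                + " native=" + FormatOffset(frame.GetNativeOffset())
                + " line=" + frame.GetFileLineNumber());
        }
        return sb.ToString();
    }
}

public static class FrameDumpDemo
{
    public static void Main()
    {
        try
        {
            string s = null;
            Console.WriteLine(s.Length); // deliberately dereferences null
        }
        catch (NullReferenceException ex)
        {
            Console.WriteLine(FrameDump.Describe(ex));
        }
    }
}
```

    In a web application the same Describe call could go wherever unhandled exceptions are already logged. The native offset is relative to the start of the method's jitted code, which is the form a debugger's disassembly listing uses.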


    One question I can see coming is: why would the Jitter emit different code on the production machine than on your machine, for the same build?

    Answer: I don't know. I am not a JIT expert, but I do believe it can, because, as I said above, these things are far more sophisticated today than the technologies used 5-10 years ago. Who knows which factors it looks at while making these optimizations: memory, number of CPUs, CPU load, 32-bit vs 64-bit, NUMA vs non-NUMA, the number of times a method has executed, how small or big a method is, who calls it, what it calls and how many times, access patterns for memory locations, and so on.

    In your case, so far only you can reproduce it, and only you have access to the jitted code in production. Hence (if I may say so :)), this is the best answer anyone can come up with.


    EDIT: Another important difference between the Jitter on one machine and the other can be the version of the Jitter itself. As patches and KBs are released for the .NET Framework, Jitter builds that differ only in minor version may well differ in optimization behavior.

    In other words, it is not sufficient to assume that both machines have the same major version of the framework (let's say .NET 4.5). Production may not receive every patch as it is released, while your dev / private machine may have the patch released last Tuesday.
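
    To compare the two machines, one option is to log a small "fingerprint" of the runtime at application startup on both. This is a sketch under assumptions: RuntimeFingerprint is a made-up name, and it assumes the JIT ships as clrjit.dll in the runtime directory, which is the case for .NET Framework 4.x on Windows as far as I know. If the file is not there, it just reports "not found".

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Runtime.InteropServices;

// Hypothetical helper (not from the original post).
public static class RuntimeFingerprint
{
    // Collects the facts most likely to differ between dev and production.
    public static string Describe()
    {
        string runtimeDir = RuntimeEnvironment.GetRuntimeDirectory();

        // Assumption: the JIT compiler is clrjit.dll inside the runtime directory.
        string jitPath = Path.Combine(runtimeDir, "clrjit.dll");
        string jitVersion = File.Exists(jitPath)
            ? FileVersionInfo.GetVersionInfo(jitPath).FileVersion
            : "not found";

        return string.Join(Environment.NewLine, new[]
        {
            "CLR version:       " + Environment.Version,
            "Runtime directory: " + runtimeDir,
            "JIT file version:  " + jitVersion,
            "64-bit process:    " + Environment.Is64BitProcess,
            "Processor count:   " + Environment.ProcessorCount,
            "OS version:        " + Environment.OSVersion
        });
    }
}

public static class RuntimeFingerprintDemo
{
    public static void Main()
    {
        Console.WriteLine(RuntimeFingerprint.Describe());
    }
}
```

    If the JIT file versions differ between the two machines, that alone is enough reason not to expect identical jitted code from the same build.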


    EDIT 2: Proof of concept - i.e. Jitter optimizations can lead to lying stack traces.

    Run the following code yourself: Release build, x64, optimizations on, TRACE and DEBUG turned off, Visual Studio Hosting Process turned off. Compile from Visual Studio, but run from Explorer. Then try to guess which line the stack trace will report for the exception.

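    using System;
    using System.Runtime.CompilerServices;

    class Program
    {
        static void Main(string[] args)
        {
            // bar is always null; NoInlining keeps the JIT from seeing that here.
            string bar = ReturnMeNull();

            for (int i = 0; i < 100; i++)
            {
                Console.WriteLine(i);
            }

            // The only place in the source where bar is dereferenced.
            for (int i = 0; i < bar.Length; i++)
            {
                Console.WriteLine(i);
            }

            Console.ReadLine();

            return;
        }

        [MethodImpl(MethodImplOptions.NoInlining)]
        static string ReturnMeNull()
        {
            return null;
        }
    }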
    

    Unfortunately, after a few attempts, I still cannot reproduce the exact problem you have seen (i.e. the error reported on the return statement), because only you have access to the exact code and any specific code pattern it may contain. Or, once again, it is some other Jitter optimization that is not documented and hence hard to guess.
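
    One experiment that may help test the theory on the real code is to switch JIT optimizations off for just the suspect method and see whether the reported line changes. The sketch below applies that idea to the proof of concept above. The class and method names are made up, and whether the line number actually moves is exactly what the experiment is meant to find out, so treat it as a diagnostic and not a fix.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical experiment (not from the original post).
public static class NoOptimizationExperiment
{
    [MethodImpl(MethodImplOptions.NoInlining)]
    static string ReturnMeNull()
    {
        return null;
    }

    // NoOptimization asks the JIT not to optimize this one method, even in
    // a Release build; NoInlining keeps it as its own stack frame.
    [MethodImpl(MethodImplOptions.NoOptimization | MethodImplOptions.NoInlining)]
    public static int Suspect()
    {
        string bar = ReturnMeNull();
        int total = 0;

        for (int i = 0; i < 100; i++)
        {
            total += i;
        }

        // Dereferences null, so this loop throws NullReferenceException.
        for (int i = 0; i < bar.Length; i++)
        {
            total += i;
        }

        return total;
    }

    public static void Main()
    {
        try
        {
            Suspect();
        }
        catch (NullReferenceException ex)
        {
            // Compare this trace with the one from the same method
            // compiled without the NoOptimization flag.
            Console.WriteLine(ex.StackTrace);
        }
    }
}
```

    If the reported line differs between the two variants, that points to the Jitter's optimizations and not to the source or the PDB.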
