Displaying exception debug information to users

社会主义新天地 提交于 2019-11-28 18:29:48

Wrapping all your code in one try/catch block is a-ok. It won't slow down the execution of anything inside it, for example. In fact, all my programs have (code similar to) this framework:

int execute(int pArgc, char *pArgv[])
{
    // do stuff
}

int main(int pArgc, char *pArgv[])
{
    // maybe setup some debug stuff,
    // like splitting cerr to log.txt

    try
    {
        return execute(pArgc, pArgv);
    }
    catch (const std::exception& e)
    {
        std::cerr << "Unhandled exception:\n" << e.what() << std::endl;
        // or other methods of displaying an error

        return EXIT_FAILURE;
    }
    catch (...)
    {
        std::cerr << "Unknown exception!" << std::endl;

        return EXIT_FAILURE;
    }
}

Putting a try/catch block in main() is okay, it doesn't cause any problems. The program is dead on an unhandled exception anyway. It isn't going to be helpful at all in your quest to get the all-important stack trace though. That info is gonzo when the catch block traps the exception.

Catching a C++ exception won't be very helpful either. The odds that the program dies on a an exception derived from std::exception are pretty slim. Although it could happen. Much more likely in a C/C++ app is death due to hardware exceptions, AccessViolation being numero uno. Trapping those requires the __try and __except keywords in your main() method. Again, very little context is available, you've basically only got an exception code. An AV also tells you which exact memory location caused the exception.

This is not just a cross-platform issue btw, you can't get a good stack trace on any platform. There is no reliable way to walk the stack, there are too many optimizations (like framepointer omission) that make this a perilous journey. It is the C/C++ way: make it as fast as possible, leave no clue what happened when it blows up.

What you need to do is debug these kind of problems the C/C++ way. You need to create a minidump. It is roughly analogous to the "core dump" of old, a snapshot of the process image at the time the exception happens. Back then, you actually got a complete dump of the core. There's been progress, nowadays it is "mini", somewhat necessary because a complete core dump would take close to 2 gigabytes. It actually works pretty well to diagnose the program state.

On Windows, that starts by calling SetUnhandledExceptionFilter(), you provide a callback function pointer to a function that will run when your program dies on an unhandled exception. Any exception, C++ as well as SEH. Your next resource is dbghelp.dll, available in the Debugging Tools for Windows download. It has an entrypoint called MiniDumpWriteDump(), it creates a minidump.

Once you get the file created by MiniDumpWriteDump(), you're pretty golden. You can load the .dmp file in Visual Studio, almost like it's a project. Press F5 and VS grinds away for a while trying to load .pdb files for the DLLs loaded in the process. You'll want to setup the symbol server, that's very important to get good stack traces. If everything works, you'll get a "debug break" at the exact location where the exception was thrown". With a stack trace.

Things you need to do to make this work smoothly:

  • Use a build server to create the binaries. It needs to push the debugging symbols (.pdb files) to a symbol server so they are readily available when you debug the minidump.
  • Configure the debugger so it can find the debugging symbols for all modules. You can get the debugging symbols for Windows from Microsoft, the symbols for your code needs to come from the symbol server mentioned above.
  • Write the code to trap the unhandled exception and create the minidump. I mentioned SetUnhandledExceptionFilter() but the code that creates the minidump should not be in the program that crashed. The odds that it can write the minidump successfully are fairly slim, the state of the program is undetermined. Best thing to do is to run a "guard" process that keeps an eye on a named Mutex. Your exception filter can set the mutex, the guard can create the minidump.
  • Create a way for the minidump to get transferred from the client's machine to yours. We use Amazon's S3 service for that, terabytes at a reasonable rate.
  • Wire the minidump handler into your debug database. We use Jira, it has a web-service that allows us to verify the crash bucket against a database of earlier crashes with the same "signature". When it is unique, or doesn't have enough hits, we ask the crash manager code to upload the minidump to Amazon and create the bug database entry.

Well, that's what I did for the company I work for. Worked out very well, it reduced crash bucket frequency from thousands to dozens. Personal message to the creators of the open source ffdshow component: I hate you with a passion. But you're no longer crashing our app anymore! Buggers.

No it's not stupid. It's a very good idea, and it costs virtually nothing at runtime until you hit an unhandled exception, of course.

Be aware that there is already an exception handler wrapping your thread, provided by the OS (and another one by the C-runtime I think). You may need to pass certain exceptions on to these handlers to get correct behavior. In some architectures, accessing mis-aligned data is handled by an exception handler. so you may want to special case EXCEPTION_DATATYPE_MISALIGNMENT and let it pass on to the higher level exception handler.

I include the registers, the app version and build number, the exception type and a stack dump in hex annotated with module names and offsets for hex values that could be addresses to code. Be sure to include the version number and build number/date of your exe.

You can also use VirtualQuery to turn stack values into "ModuleName+Offset" pretty easily. And that, combined with a .MAP file will often tell you exactly where you crashed.

I found that I could train beta testers to send my the text pretty easily, but in the early days what I got was a picture of the error dialog rather than the text. I think that's because a lot of users don't know you can right click on any Edit control to get a menu with "Select All" and "Copy". If I was going to do it again, I would add a button that copied that text to the clipboard so that it can easily be pasted into an email.

Even better if you want to go to the trouble of haveing a 'send error report' button, but just giving users a way to get the text into their own emails gets you most of the way there, and doesn't raise any red flags about "what information am I sharing with them?"

In fact, boost::diagnostic_information has been designed specifically to be used in a "global" catch(...) block, to display information about exceptions which should not have reached it. However, note that the string returned by boost::diagnostic_information is NOT user-friendly.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!