I'm writing a performance-critical, number-crunching C++ project where 70% of the time is spent in the 200-line core module.
I'd like to optimize the core using inl
I prefer writing entire functions in assembly rather than using inline assembly. This allows you to swap out the high-level language function for the assembly one during the build process. Also, you don't have to worry about the compiler's optimizer rearranging or interfering with your hand-written code.
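As a rough sketch of that build-time swap (the names `core_dot` and `USE_ASM_CORE` are made up for illustration, not from any real project): declare the function with C linkage so a separate assembly file can define the same unmangled symbol, and compile the C++ body only when the assembly version is not being linked in.

```cpp
#include <cstddef>

// Hypothetical kernel. C linkage keeps the symbol name unmangled, so a
// separately assembled file can define exactly the same symbol.
extern "C" double core_dot(const double* a, const double* b, std::size_t n);

#ifndef USE_ASM_CORE
// Portable reference version. It is compiled unless the build defines
// USE_ASM_CORE (an assumed macro) and links an assembly object that
// provides core_dot instead.
extern "C" double core_dot(const double* a, const double* b, std::size_t n) {
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}
#endif
```

Keeping the C++ version around also gives you a reference to test the assembly version against.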
Before you write a single line of assembly, have the compiler generate the assembly language listing for your function. This gives you a foundation to build upon or modify. Another helpful tool is a listing that interleaves the assembly with the source code. This will show you how the compiler is coding specific statements.
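For example, assuming a GCC or Clang toolchain with GNU binutils (other compilers have their own switches), the listings for a small kernel could be produced with the commands in the comments below. The file name `saxpy.cpp` and the function itself are only placeholders.

```cpp
// saxpy.cpp (hypothetical file name)
//
// Plain assembly listing:
//   g++ -O2 -S saxpy.cpp -o saxpy.s
// GCC listing with extra compiler-generated comments:
//   g++ -O2 -S -fverbose-asm saxpy.cpp -o saxpy.s
// Disassembly interleaved with source lines (needs debug info):
//   g++ -O2 -g -c saxpy.cpp -o saxpy.o
//   objdump -d -S saxpy.o
#include <cstddef>

// Small example kernel to inspect: y[i] = a * x[i] + y[i].
void saxpy(float a, const float* x, float* y, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

Generate the listing at the same optimization level you ship with; otherwise you are looking at code that is not what actually runs.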
If you need to insert inline assembly into a large function, make a new function for just the code that needs the inline assembly. Again, swap between the C++ and assembly versions at build time.
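A minimal sketch of that split, assuming GCC-style extended asm on x86-64 (the helper `accumulate_step`, the function `checksum`, and the macro `USE_INLINE_ASM` are invented for this example): the large function stays in C++, and only the small helper has two interchangeable bodies.

```cpp
#include <cstddef>
#include <cstdint>

// The hot step, pulled out of the large function below. Only this helper
// has an inline-assembly variant; the fallback is plain C++.
static inline std::uint64_t accumulate_step(std::uint64_t acc, std::uint64_t x) {
#if defined(USE_INLINE_ASM) && defined(__GNUC__) && defined(__x86_64__)
    // AT&T syntax: add the source operand (%1) into the destination (%0).
    __asm__("addq %1, %0" : "+r"(acc) : "r"(x) : "cc");
    return acc;
#else
    return acc + x;
#endif
}

// Stand-in for the "large function": it calls the helper and never sees
// which variant was selected at build time.
std::uint64_t checksum(const std::uint64_t* data, std::size_t n) {
    std::uint64_t acc = 0;
    for (std::size_t i = 0; i < n; ++i) {
        acc = accumulate_step(acc, data[i]);
    }
    return acc;
}
```

A single add is obviously not worth hand-coding; it only stands in for whatever part of the core actually needs the assembly.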
These are my suggestions; Your Mileage May Vary (YMMV).