I'm writing a performance-critical, number-crunching C++ project where 70% of the time is used by the 200-line core module.
I'd like to optimize the core using inl
Go for the low hanging fruit first...
As others have said, the Microsoft compiler is pretty poor at optimisation. You may be able to save yourself a lot of effort just by investing in a decent compiler, such as Intel's ICC, and recompiling the code "as is". You can get a 30-day free evaluation license from Intel and try it out.
Also, if you have the option to build a 64-bit executable, running in 64-bit mode can yield a performance improvement of around 30%, because x86-64 doubles the number of available registers (16 general-purpose and 16 SSE registers, versus 8 of each in 32-bit mode).
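Whichever compiler or target you try, it is worth timing the core before and after so you can tell whether the change actually helped. Below is a minimal sketch of such a harness. Note that `core_kernel` is a hypothetical stand-in (a plain dot product), not the real 200-line module, and `is_64bit_build` and `best_time_seconds` are names made up for this illustration.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the real core module: a plain dot product.
double core_kernel(const std::vector<double>& a, const std::vector<double>& b) {
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}

// True when pointers are 8 bytes wide, i.e. the build targets a 64-bit platform.
constexpr bool is_64bit_build() { return sizeof(void*) == 8; }

// Runs the kernel `repeats` times and returns the best wall-clock time in
// seconds. The last kernel result is written to `result_out` so the call is
// not trivially dead code.
double best_time_seconds(const std::vector<double>& a,
                         const std::vector<double>& b,
                         int repeats,
                         double& result_out) {
    using clock = std::chrono::steady_clock;
    double best = -1.0;
    for (int r = 0; r < repeats; ++r) {
        const auto start = clock::now();
        result_out = core_kernel(a, b);
        const auto stop = clock::now();
        const double elapsed = std::chrono::duration<double>(stop - start).count();
        if (best < 0.0 || elapsed < best) {
            best = elapsed;
        }
    }
    return best;
}
```

Build the same source once per configuration (current compiler, alternative compiler, 32-bit, 64-bit), call `best_time_seconds` on realistic input sizes, and compare the numbers. Taking the best of several runs is a simple way to reduce noise from other processes; it is a design choice for this sketch, not the only reasonable one.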