The biggest optimizations usually come from revisiting the design, after profiling the performance-relevant parts and algorithms of the application. This is usually not language-specific.
To give a rough idea (the numbers are only illustrative): if selecting a slightly better algorithm (or collection/container class) gets you a 30% performance improvement, the improvement you can expect from a C++-specific refactoring would be at most 2%. A design improvement could give you anything above 30%.
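As a minimal sketch of what "a slightly better container" can mean, here are two ways to count how many query values occur in a collection. The function names `count_hits_linear` and `count_hits_hashed` are made up for this illustration. The first scans a `std::vector` for every query, the second builds a `std::unordered_set` once and looks each query up in it.

```cpp
#include <algorithm>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical example: linear scan of the vector for every query,
// roughly O(n * m) for n elements and m queries.
std::size_t count_hits_linear(const std::vector<int>& haystack,
                              const std::vector<int>& queries) {
    std::size_t hits = 0;
    for (int q : queries) {
        if (std::find(haystack.begin(), haystack.end(), q) != haystack.end()) {
            ++hits;
        }
    }
    return hits;
}

// Same result, but the elements are copied into a hash set first,
// so each lookup is O(1) on average and the total is roughly O(n + m).
std::size_t count_hits_hashed(const std::vector<int>& haystack,
                              const std::vector<int>& queries) {
    const std::unordered_set<int> lookup(haystack.begin(), haystack.end());
    std::size_t hits = 0;
    for (int q : queries) {
        if (lookup.count(q) != 0) {
            ++hits;
        }
    }
    return hits;
}
```

Which version wins depends on the sizes involved: for a handful of elements the plain vector scan may well be faster, which is one more reason to measure rather than guess.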
If you have a concrete application, the best strategy is to measure and profile it. Profiling usually gives the quickest idea of which parts are performance-relevant.
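If no profiler is at hand, even a crude timing helper can give a first impression. The following is only a sketch: `time_ms` is a hypothetical name, it measures a single run with `std::chrono::steady_clock`, and it is no substitute for a real profiler or for repeated measurements.

```cpp
#include <chrono>
#include <utility>

// Hypothetical helper: runs `work` once and returns the elapsed
// wall-clock time in milliseconds, using a monotonic clock.
template <typename Work>
double time_ms(Work&& work) {
    const auto start = std::chrono::steady_clock::now();
    std::forward<Work>(work)();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

You would call it as `time_ms([&] { run_hot_path(); })`, where `run_hot_path` stands for whatever code you suspect is slow. Build with optimizations enabled and run it several times, since a single measurement can be dominated by noise.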