Effective optimization strategies on modern C++ compilers

梦如初夏 2020-12-22 17:02

I'm working on scientific code that is very performance-critical. An initial version of the code has been written and tested, and now, with profiler in hand, it's time to

19 answers
  •  星月不相逢
    2020-12-22 17:46

    About STL containers.

    Most people here claim that the STL offers one of the fastest possible implementations of the container algorithms. I say the opposite: in most real-world scenarios, the STL containers taken as-is yield really catastrophic performance.

    People argue about the complexity of the algorithms used in the STL. Here the STL is good: O(1) for list/queue, amortized O(1) for vector, and O(log(N)) for map. But this is not the real performance bottleneck of a typical application! For many applications the real bottleneck is heap operations (malloc/free, new/delete, etc.).

    A typical operation on a list costs just a few CPU cycles. On a map it costs some tens, maybe more (this depends on the cache state and on log(N), of course). A typical heap operation costs from hundreds to thousands (!!!) of CPU cycles. In multithreaded applications, for instance, heap operations also require synchronization (interlocked operations). Plus, on some OSes (such as Windows XP) the heap functions are implemented entirely in kernel mode.

    So the actual performance of the STL containers in a typical scenario is dominated by the number of heap operations they perform. And here they're disastrous. Not because they're implemented poorly, but because of their design: node-based containers such as list and map allocate every element separately on the heap.
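
    The claim about heap traffic can be checked with a counting allocator. The following is a minimal sketch, not a benchmark: CountingAllocator, g_allocations and the two helper functions are hypothetical names introduced only for this illustration. It counts calls to allocate(), not CPU cycles, and the counter is read after the container is constructed so that any allocation made by the constructor itself is excluded.

```cpp
#include <cstddef>
#include <list>
#include <memory>
#include <vector>

// Hypothetical global counter, bumped on every allocate() call.
static std::size_t g_allocations = 0;

// Minimal allocator that forwards to std::allocator and counts allocations.
template <class T>
struct CountingAllocator {
    using value_type = T;

    CountingAllocator() = default;
    template <class U>
    CountingAllocator(const CountingAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        ++g_allocations;
        return std::allocator<T>().allocate(n);
    }
    void deallocate(T* p, std::size_t n) noexcept {
        std::allocator<T>().deallocate(p, n);
    }
};

// All instances are interchangeable, so they always compare equal.
template <class T, class U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept {
    return true;
}
template <class T, class U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept {
    return false;
}

// Number of allocate() calls caused by n push_back calls on a std::list.
std::size_t count_list_allocations(std::size_t n) {
    std::list<int, CountingAllocator<int>> values;
    const std::size_t baseline = g_allocations;  // ignore constructor allocations
    for (std::size_t i = 0; i < n; ++i) {
        values.push_back(static_cast<int>(i));
    }
    return g_allocations - baseline;
}

// Same workload on a std::vector whose capacity is reserved up front.
std::size_t count_reserved_vector_allocations(std::size_t n) {
    std::vector<int, CountingAllocator<int>> values;
    const std::size_t baseline = g_allocations;  // ignore constructor allocations
    values.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        values.push_back(static_cast<int>(i));
    }
    return g_allocations - baseline;
}
```

    With common standard library implementations, count_list_allocations(1000) reports one allocation per inserted element, while the reserved vector needs a single allocation for the whole run.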

    On the other hand, there are other containers that are designed differently. I once designed and wrote such containers for my own needs:

    http://www.codeproject.com/KB/recipes/Containers.aspx

    For me they proved to be superior, and not only from the performance point of view.

    But recently I discovered that I'm not the only one who thought about this. boost::intrusive is a container library implemented in a manner similar to what I did back then.

    I suggest you try it (if you haven't already).
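
    To show what "designed differently" means here, below is a hand-rolled sketch of the intrusive idea: the link pointers live inside the element itself, so putting an element into the list never touches the heap. This is not the API of boost::intrusive nor of the CodeProject containers linked above. ListHook, Particle and IntrusiveList are hypothetical names made up for the illustration, and the list assumes that the caller keeps the elements alive while they are linked.

```cpp
#include <cstddef>

// Hypothetical hook: every element carries its own link pointers.
struct ListHook {
    ListHook* prev = nullptr;
    ListHook* next = nullptr;
};

// Example element type; it becomes listable by deriving from the hook.
struct Particle : ListHook {
    double x = 0.0;
    explicit Particle(double x_) : x(x_) {}
};

// Circular doubly linked list with a sentinel node.
// It only rewires pointers and never allocates or frees memory.
class IntrusiveList {
public:
    IntrusiveList() { head_.prev = head_.next = &head_; }
    IntrusiveList(const IntrusiveList&) = delete;
    IntrusiveList& operator=(const IntrusiveList&) = delete;

    // Links an element owned by the caller at the end of the list.
    void push_back(ListHook& node) {
        node.prev = head_.prev;
        node.next = &head_;
        head_.prev->next = &node;
        head_.prev = &node;
        ++size_;
    }

    // Unlinks an element that is currently in this list; O(1), no search.
    void erase(ListHook& node) {
        node.prev->next = node.next;
        node.next->prev = node.prev;
        node.prev = node.next = nullptr;
        --size_;
    }

    std::size_t size() const { return size_; }
    bool empty() const { return size_ == 0; }
    ListHook* front() { return empty() ? nullptr : head_.next; }
    ListHook* back() { return empty() ? nullptr : head_.prev; }

private:
    ListHook head_;          // sentinel, not a real element
    std::size_t size_ = 0;
};
```

    The elements can live on the stack, in an array, or in one preallocated pool. The trade-off is that the list does not own them, so lifetime management stays with the caller.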
