Coding Practices which enable the compiler/optimizer to make a faster program

后端 未结 30 1946
一个人的身影
一个人的身影 2020-12-02 03:24

Many years ago, C compilers were not particularly smart. As a workaround K&R invented the register keyword, to hint to the compiler, that maybe it woul

30条回答
  •  醉话见心
    2020-12-02 04:24

    Two coding technics I didn't saw in the above list:

    Bypass linker by writing code as an unique source

    While separate compilation is really nice for compiling time, it is very bad when you speak of optimization. Basically the compiler can't optimize beyond compilation unit, that is linker reserved domain.

    But if you design well your program you can can also compile it through an unique common source. That is instead of compiling unit1.c and unit2.c then link both objects, compile all.c that merely #include unit1.c and unit2.c. Thus you will benefit from all the compiler optimizations.

    It's very like writing headers only programs in C++ (and even easier to do in C).

    This technique is easy enough if you write your program to enable it from the beginning, but you must also be aware it change part of C semantic and you can meet some problems like static variables or macro collision. For most programs it's easy enough to overcome the small problems that occurs. Also be aware that compiling as an unique source is way slower and may takes huge amount of memory (usually not a problem with modern systems).

    Using this simple technique I happened to make some programs I wrote ten times faster!

    Like the register keyword, this trick could also become obsolete soon. Optimizing through linker begin to be supported by compilers gcc: Link time optimization.

    Separate atomic tasks in loops

    This one is more tricky. It's about interaction between algorithm design and the way optimizer manage cache and register allocation. Quite often programs have to loop over some data structure and for each item perform some actions. Quite often the actions performed can be splitted between two logically independent tasks. If that is the case you can write exactly the same program with two loops on the same boundary performing exactly one task. In some case writing it this way can be faster than the unique loop (details are more complex, but an explanation can be that with the simple task case all variables can be kept in processor registers and with the more complex one it's not possible and some registers must be written to memory and read back later and the cost is higher than additional flow control).

    Be careful with this one (profile performances using this trick or not) as like using register it may as well give lesser performances than improved ones.

提交回复
热议问题