Why does GCC generate 15-20

天涯浪人 2020-11-28 17:17

I first noticed in 2009 that GCC (at least on my projects and on my machines) has a tendency to generate noticeably faster code if I optimize for size (<

6 answers
  •  陌清茗 (original poster)
    2020-11-28 17:26

    I'm by no means an expert in this area, but I seem to remember that modern processors are quite sensitive when it comes to branch prediction. The algorithms used to predict branches are (or at least were, back in the days when I wrote assembler code) based on several properties of the code, including the distance to the branch target and the direction of the branch.

    The scenario that comes to mind is small loops. When a branch went backwards and the distance was not too far, branch prediction was optimized for this case, since all small loops are built this way. The same rules might come into play when you swap the locations of add and work in the generated code, or when the position of both changes slightly.

    That said, I have no idea how to verify that and I just wanted to let you know that this might be something you want to look into.
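
    To make the small-loop scenario concrete, here is a minimal sketch in C. The bodies of add and work are assumptions made up for illustration (the real functions from the question are not reproduced here), and the noinline attribute is a GCC/Clang extension used only to keep the call inside the loop:

```c
#include <stdint.h>

/* Hypothetical helper, not the code from the question.
   noinline keeps add() as a separate function, so its placement
   relative to work() in the generated code can actually differ. */
__attribute__((noinline))
static int add(int x, int y)
{
    return x + y;
}

/* Hypothetical workload: a small loop. The branch that closes the loop
   jumps backwards over a short distance, which is the case described above. */
__attribute__((noinline))
static int64_t work(int n)
{
    int64_t sum = 0;
    for (int i = 0; i < n; ++i) {
        sum += add(i, 1);
    }
    return sum;
}
```

    One possible way to look into it would be to build such a file once with -O2 and once with -Os, compare the disassembly (for example with objdump -d) to see where add and work end up and how the loop is aligned, and then experiment with GCC's -falign-functions and -falign-loops options. Whether that reproduces the effect depends on the compiler version and the CPU.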
