Floating Point Math Execution Time

礼貌的吻别 2020-12-19 18:18

What accounts for the added execution time of the first data set? The assembly instructions are the same.

With DN_FLUSH flag not on, the first data set takes 63 m

2 answers

粉色の甜心 (original poster)

2020-12-19 18:28

Quoting from Intel's optimization manual:

When an input operand for a SIMD floating-point instruction [here this includes scalar arithmetic done using SSE] contains values that are less than the representable range of the data type, a denormal exception occurs. This causes a significant performance penalty. An SIMD floating-point operation has a flush-to-zero mode in which the results will not underflow. Therefore subsequent computation will not face the performance penalty of handling denormal input operands.
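To make "less than the representable range of the data type" concrete, here is a minimal Python sketch that checks whether a value would be stored as a denormal in single precision. It is not from the manual: the helper name `is_float32_denormal` is made up for illustration, and it only inspects the bit pattern (exponent field all zeros, mantissa nonzero). It does not measure the performance penalty itself.

```python
import struct


def is_float32_denormal(x: float) -> bool:
    """Return True if x, rounded to IEEE 754 single precision, is a nonzero denormal.

    Hypothetical helper for illustration only.
    """
    # Round x to single precision and reinterpret the 4 bytes as an unsigned int.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    exponent = (bits >> 23) & 0xFF   # 8-bit biased exponent field
    mantissa = bits & 0x7FFFFF       # 23-bit fraction field
    # Denormals have an all-zero exponent field and a nonzero fraction.
    return exponent == 0 and mantissa != 0


if __name__ == "__main__":
    for value in (1.0, 2.0 ** -126, 2.0 ** -127, 1e-40, 0.0):
        print(value, is_float32_denormal(value))
```

The smallest normal single-precision value is 2^-126 (about 1.18e-38), so anything nonzero below that in magnitude falls into the denormal range.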

As for how to avoid this, if you can't flush denormals: do what you can to make sure your data is scaled appropriately and you don't encounter denormals in the first place. Usually this means delaying applying some scale factor until you've finished all of your other computation.
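As a rough illustration of delaying the scale factor, the sketch below computes an RMS value two ways while emulating single precision in Python. Everything here is an assumption for the example (the sample data, the gain of 1e-20, and the helper names `f32`, `is_denormal32`, `rms_scale_early`, `rms_scale_late`); it only counts how many intermediate results land in the denormal range and does not time anything.

```python
import math
import struct


def f32(x: float) -> float:
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def is_denormal32(x: float) -> bool:
    """True if x is a nonzero single-precision denormal."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return ((bits >> 23) & 0xFF) == 0 and (bits & 0x7FFFFF) != 0


def rms_scale_early(samples, gain):
    """Apply the gain to every sample first, then square, sum and take the root."""
    gain = f32(gain)
    total = 0.0
    denormals = 0
    for s in samples:
        scaled = f32(f32(s) * gain)
        square = f32(scaled * scaled)      # tiny * tiny: this is where denormals appear
        total = f32(total + square)
        denormals += sum(is_denormal32(v) for v in (scaled, square, total))
    mean = f32(total / len(samples))
    denormals += is_denormal32(mean)
    return f32(math.sqrt(mean)), denormals


def rms_scale_late(samples, gain):
    """Compute the RMS on well-scaled data and apply the gain once at the end."""
    gain = f32(gain)
    total = 0.0
    denormals = 0
    for s in samples:
        square = f32(f32(s) * f32(s))
        total = f32(total + square)
        denormals += sum(is_denormal32(v) for v in (square, total))
    mean = f32(total / len(samples))
    rms = f32(math.sqrt(mean))
    result = f32(rms * gain)               # the only operation that touches the gain
    denormals += sum(is_denormal32(v) for v in (mean, rms, result))
    return result, denormals


SAMPLES = [0.5, 0.25, 0.75, 1.0]   # made-up data
GAIN = 1e-20                       # made-up scale factor

if __name__ == "__main__":
    print("early:", rms_scale_early(SAMPLES, GAIN))
    print("late: ", rms_scale_late(SAMPLES, GAIN))
```

Both versions produce essentially the same result, but in the early-scaling version every squared term is around 1e-41 and so is a denormal, while the late-scaling version never leaves the normal range.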

Alternatively, do your computations in double, which has a much larger exponent range and therefore makes it much less likely that you will encounter denormals in the first place.
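To put numbers on the exponent range, the following sketch compares the smallest normal value of each type and checks one small product against both. The helper `is_denormal_for` and the test value are assumptions made for the example, and the check is a plain magnitude comparison that ignores rounding right at the boundary.

```python
import sys

FLOAT32_MIN_NORMAL = 2.0 ** -126          # about 1.18e-38
FLOAT64_MIN_NORMAL = sys.float_info.min   # 2**-1022, about 2.23e-308


def is_denormal_for(x: float, min_normal: float) -> bool:
    """True if x is nonzero but smaller in magnitude than the given smallest normal value.

    Hypothetical helper: a magnitude check only, no rounding to the target type.
    """
    return x != 0.0 and abs(x) < min_normal


# Product of two modest-looking scale factors, computed in double precision.
value = 1e-20 * 1e-20   # roughly 1e-40

if __name__ == "__main__":
    print("float32 min normal:", FLOAT32_MIN_NORMAL)
    print("float64 min normal:", FLOAT64_MIN_NORMAL)
    print("denormal as float32?", is_denormal_for(value, FLOAT32_MIN_NORMAL))
    print("denormal as float64?", is_denormal_for(value, FLOAT64_MIN_NORMAL))
```

A value near 1e-40 is already below the normal range for single precision, but it is still hundreds of orders of magnitude above the point where double precision starts producing denormals.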
