SSE42 & STTNI - PcmpEstrM is twice slower than PcmpIstrM, is it true?

送分小仙女□ 提交于 2019-12-01 16:36:42

According to the instruction tables of Agner fog, pcmpestrm takes 8 µops, whereas pcmpistrm takes 3 µops on most architectures. This should explain the performance difference you observe. Consider rewriting your code so you can use pcmpistrm instead of pcmpestrm if possible.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!