SSE4.2 & STTNI: is PcmpEstrM really twice as slow as PcmpIstrM?

爱一瞬间的悲伤 2020-12-20 17:51

I'm experimenting with SSE4.2 and the STTNI instructions and got a strange result: PcmpEstrM (which works with explicit-length strings) runs twice as slow

1 answer
  • 2020-12-20 18:50

    According to Agner Fog's instruction tables, pcmpestrm takes 8 µops, whereas pcmpistrm takes only 3 µops on most architectures. That difference should explain the slowdown you observe. If possible, rewrite your code so that it uses pcmpistrm instead of pcmpestrm.
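A minimal sketch of what that switch could look like with compiler intrinsics, assuming GCC or Clang on x86-64 and data that contains no embedded zero bytes (the implicit-length form treats the first zero byte as the end of the string). The helper names `load_padded`, `match_mask_explicit` and `match_mask_implicit` are hypothetical and do not come from the question; the sketch handles a single block of at most 16 bytes and is not a benchmark.

```c
#include <assert.h>
#include <nmmintrin.h>  /* SSE4.2 string intrinsics: _mm_cmpestrm, _mm_cmpistrm */
#include <string.h>

/* Unsigned bytes, "equal any" (set membership), one result bit per byte. */
#define STTNI_MODE (_SIDD_UBYTE_OPS | _SIDD_CMP_EQUAL_ANY | _SIDD_BIT_MASK)

/* Hypothetical helper: copy up to 16 bytes into a zero-padded vector. */
__attribute__((target("sse4.2")))
static __m128i load_padded(const char *s, size_t n)
{
    char buf[16] = {0};
    assert(n <= 16);            /* this sketch handles one 16-byte block only */
    memcpy(buf, s, n);
    return _mm_loadu_si128((const __m128i *)buf);
}

/* Explicit-length form: the intrinsic maps to pcmpestrm. */
__attribute__((target("sse4.2")))
static int match_mask_explicit(const char *set, int set_len,
                               const char *text, int text_len)
{
    __m128i a = load_padded(set, (size_t)set_len);
    __m128i b = load_padded(text, (size_t)text_len);
    return _mm_cvtsi128_si32(_mm_cmpestrm(a, set_len, b, text_len, STTNI_MODE));
}

/* Implicit-length form: the intrinsic maps to pcmpistrm.
   Each operand ends at its first zero byte, so no lengths are passed. */
__attribute__((target("sse4.2")))
static int match_mask_implicit(const char *set, const char *text)
{
    __m128i a = load_padded(set, strlen(set));
    __m128i b = load_padded(text, strlen(text));
    return _mm_cvtsi128_si32(_mm_cmpistrm(a, b, STTNI_MODE));
}
```

Bit i of the returned mask is set when byte i of `text` occurs in `set`. For zero-terminated input shorter than 16 bytes both forms return the same mask, so the implicit-length version can stand in for the explicit one whenever the data cannot contain a zero byte.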
