Bilinear filter with SSE4.1 intrinsics

Posted by 南楼画角 on 2019-12-03 14:42:10
David McPaul

I have nothing specific to say about your code, but I wrote my own bilinear scaling code using SSE2. See the StackOverflow question "Help me improve some more SSE2 code" for more details.

In my code I calculate the horizontal and vertical fractions and indexes first rather than per pixel. I think this is faster.

That said, on Core 2 CPUs my code seems to be memory-bound rather than CPU-bound, so skipping the precalculation might actually be faster.
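To make the precalculation idea concrete, here is a minimal sketch in plain C. It is not the code from the linked question: the function names `precalc_axis` and `sample_row`, the 16.16 fixed-point stepping, and the 8-bit fraction are all assumptions chosen for illustration. The table is built once per axis (once for width, once for height) and then reused for every pixel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: for each destination coordinate along one axis,
 * store the lower source index and an 8-bit blend fraction (0..255). */
void precalc_axis(uint32_t *index, uint8_t *frac,
                  size_t dst_len, size_t src_len)
{
    assert(dst_len > 0 && src_len > 0);
    /* 16.16 fixed-point step (an assumption of this sketch); maps
     * dst 0 to src 0 and dst_len-1 to src_len-1. */
    uint32_t step = (dst_len > 1)
        ? (uint32_t)((((uint64_t)src_len - 1) << 16) / (dst_len - 1))
        : 0;
    uint32_t pos = 0;
    for (size_t i = 0; i < dst_len; ++i) {
        index[i] = pos >> 16;                    /* integer part */
        frac[i]  = (uint8_t)((pos >> 8) & 0xFF); /* top 8 fraction bits */
        pos += step;
    }
}

/* Hypothetical helper: blend two neighbouring samples of one row using a
 * precalculated index and fraction. The neighbour index is clamped so the
 * last sample never reads past the end of the row. */
uint8_t sample_row(const uint8_t *row, size_t src_len,
                   uint32_t idx, uint8_t f)
{
    uint32_t next = (idx + 1 < src_len) ? idx + 1 : idx;
    return (uint8_t)((row[idx] * (256 - f) + row[next] * f) >> 8);
}
```

The inner loop then only does table lookups and blends. Whether that beats recomputing the values per pixel depends on whether the loop is limited by memory or by arithmetic, so it is worth measuring both variants.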

I also noticed your comment "TODO: Should this be an arithmetic or logical shift or does it matter?"

An arithmetic shift is for signed integers: it fills the vacated high bits with copies of the sign bit. A logical shift is for unsigned integers: it fills them with zeros.

    0x80000000 >> 4 is 0xf8000000 // Arithmetic shift
    0x80000000 >> 4 is 0x08000000 // Logical shift
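The same difference shows up in the SSE2 intrinsics. The sketch below uses `_mm_srai_epi32` (arithmetic) and `_mm_srli_epi32` (logical) on 32-bit lanes. The wrapper names `shift_arithmetic4` and `shift_logical4` are hypothetical and exist only for illustration. It assumes an x86 target with SSE2.

```c
#include <emmintrin.h> /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: arithmetic right shift by 4 of every 32-bit lane,
 * returning lane 0. The sign bit is copied into the vacated bits. */
uint32_t shift_arithmetic4(uint32_t x)
{
    __m128i v = _mm_set1_epi32((int32_t)x);
    return (uint32_t)_mm_cvtsi128_si32(_mm_srai_epi32(v, 4));
}

/* Hypothetical helper: logical right shift by 4 of every 32-bit lane,
 * returning lane 0. Zeros are shifted into the vacated bits. */
uint32_t shift_logical4(uint32_t x)
{
    __m128i v = _mm_set1_epi32((int32_t)x);
    return (uint32_t)_mm_cvtsi128_si32(_mm_srli_epi32(v, 4));
}
```

The two shifts only differ when the top bit of the lane is set. If the values being shifted are known to be non-negative, as fixed-point weights and pixel sums usually are, both give the same result and the choice does not matter.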