Optimizations for pow() with const non-integer exponent?

旧时难觅i 2020-12-04 09:17

I have hot spots in my code where calls to pow() take up around 10-20% of my execution time.

My input to pow(x,y) is very specific, so

10 answers
  •  生来不讨喜
    2020-12-04 09:42

    This might not answer your question.

    The 2.4f and 1/2.4f make me very suspicious, because those are exactly the powers used to convert between sRGB and a linear RGB color space. So you might actually be trying to optimize that, specifically. I don't know, which is why this might not answer your question.

    If this is the case, try using a lookup table. Something like:

    __attribute__((aligned(64)))
    static const unsigned short SRGB_TO_LINEAR[256] = { ... };
    __attribute__((aligned(64)))
    static const unsigned short LINEAR_TO_SRGB[256] = { ... };
    
    void apply_lut(const unsigned short lut[256], unsigned char *src, ...
    

    If you are using 16-bit data, change as appropriate. I would make the table 16 bits anyway so you can dither the result if necessary when working with 8-bit data. This obviously won't work very well if your data is floating point to begin with -- but it doesn't really make sense to store sRGB data in floating point, so you might as well convert to 16-bit / 8-bit first and then do the conversion from linear to sRGB.

    (The reason sRGB doesn't make sense as floating point is that HDR should be linear, and sRGB is only convenient for storing on disk or displaying on screen, but not convenient for manipulation.)
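
    To make the lookup-table idea concrete, here is a minimal sketch for the 8-bit sRGB to 16-bit linear direction. It assumes the standard piecewise sRGB decoding curve (a linear segment below 0.04045, and an exponent of 2.4 above it) and fills the table at startup instead of hard-coding the 256 values. The names `srgb_to_linear_lut`, `srgb_decode` and `init_srgb_to_linear_lut`, and the exact `apply_lut` signature, are hypothetical and chosen only for illustration; the original `apply_lut` prototype above is truncated and may differ.

    ```c
    #include <assert.h>
    #include <math.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical table: 8-bit sRGB code -> 16-bit linear intensity (0..65535). */
    static uint16_t srgb_to_linear_lut[256];

    /* Standard piecewise sRGB decoding for a value c in [0, 1]. */
    static double srgb_decode(double c)
    {
        return (c <= 0.04045) ? c / 12.92 : pow((c + 0.055) / 1.055, 2.4);
    }

    /* Fill the table once, e.g. at startup; pow() runs at most 256 times. */
    static void init_srgb_to_linear_lut(void)
    {
        for (int i = 0; i < 256; ++i) {
            double linear = srgb_decode(i / 255.0);
            /* Scale to 16 bits and round to nearest. */
            srgb_to_linear_lut[i] = (uint16_t)(linear * 65535.0 + 0.5);
        }
    }

    /* Map n 8-bit samples through a 256-entry table; no pow() in the hot loop. */
    static void apply_lut(const uint16_t lut[256], const unsigned char *src,
                          uint16_t *dst, size_t n)
    {
        assert(lut != NULL && src != NULL && dst != NULL);
        for (size_t i = 0; i < n; ++i)
            dst[i] = lut[src[i]];
    }
    ```

    After calling `init_srgb_to_linear_lut()` once, the per-pixel cost is a single indexed load. The reverse direction (linear to sRGB) would follow the same pattern with the inverse curve, though an 8-bit linear input loses a lot of precision in the dark tones, so a larger table indexed by 16-bit linear values may be the better fit there.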
