YUV to RGBA on Apple A4, should I use shaders or NEON?

最后都变了- 提交于 2019-11-29 20:08:43

It is most definitely worthwhile learning OpenGL ES2.0 shaders:

  1. You can load-balance between the GPU and CPU (e.g. video decoding of subsequent frames while GPU renders the current frame).
  2. Video frames need to go to the GPU in any case: using YCbCr saves you 25% bus bandwidth if your video has 4:2:0 sampled chrominance.
  3. You get 4:2:0 chrominance up-sampling for free, with the GPU hardware interpolator. (Your shader should be configured to use the same vertex coordinates for both Y and C{b,r} textures, in effect stretching the chrominance texture out over the same area.)
  4. On iOS5 pushing YCbCr textures to the GPU is fast (no data-copy or swizzling) with the texture cache (see the CVOpenGLESTextureCache* API functions). You will save 1-2 data-copies compared to NEON.

I am using these techniques to great effect in my super-fast iPhone camera app, SnappyCam.

You are on the right track for implementation: use a GL_LUMINANCE texture for Y and GL_LUMINANCE_ALPHA if your CbCr is interleaved. Otherwise use three GL_LUMINANCE textures if all of your YCbCr components are noninterleaved.

Creating two textures for 4:2:0 bi-planar YCbCr (where CbCr is interleaved) is straightforward:

    glBindTexture(GL_TEXTURE_2D, texture_y);
    glTexImage2D(
        GL_TEXTURE_2D, 
        0, 
        GL_LUMINANCE,        // Texture format (8bit)
        width,
        height,
        0,                   // No border
        GL_LUMINANCE,        // Source format (8bit)
        GL_UNSIGNED_BYTE,    // Source data format
        NULL
    );
    glBindTexture(GL_TEXTURE_2D, texture_cbcr);
    glTexImage2D(
        GL_TEXTURE_2D, 
        0, 
        GL_LUMINANCE_ALPHA, // Texture format (16-bit)
        width / 2,
        height / 2,
        0,                  // No border
        GL_LUMINANCE_ALPHA, // Source format (16-bits)
        GL_UNSIGNED_BYTE,   // Source data format
        NULL
    );

where you would then use glTexSubImage2D() or the iOS5 texture cache to update these textures.

I'd also recommend using a 2D varying that spans the texture coordinate space (x: [0,1], y: [0,1]) so that you avoid any dependent texture reads in your fragment shader. The end result is super-fast and doesn't load the GPU at all in my experience.

Converting YUV to RGB using NEON is very slow. Use a shader to offload onto the GPU.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!