How can I correctly unpack a V210 video frame using GLSL?


I recommend first writing a shader that only does the pixel reordering, and keeping interpolation out of it.

It does require extra video RAM and another render pass, but it is not necessarily slower: if you include rescaling in the unpacking shader, you need to compute the contents of four intermediate pixels and then interpolate between them, whereas a separate interpolation shader can be as simple as returning the result of a single texture lookup with hardware interpolation.
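For example (a minimal sketch, not taken from the question): if the reordering pass has already rendered full-resolution RGB into an intermediate texture, here called unpackedTex and sampled with GL_LINEAR filtering, the second pass reduces to a single lookup:

#version 130
in vec2 texcoord;
uniform sampler2D unpackedTex; // assumed: output of the reordering pass, GL_LINEAR filtering
out vec4 color;

void main(void) {
  // Hardware bilinear interpolation does all the rescaling work here.
  color = texture(unpackedTex, texcoord);
}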

Once you have a correct shader for the color sample rearrangement, you can always turn it into a function.

So then how to write the rearrangement shader?

Your inputs look like this:

  U Y V A | Y U Y A | V Y U A | Y V Y A

For simplicity, let's assume you only want to read Y. You can then make a very simple 1D lookup texture (720 columns x 1 row). Each texture cell holds two values: first, the source column to read from; second, the location of the Y sample within that source texel:

  U Y V A | Y U Y A | V Y U A | Y V Y A


  0      1      2      3      4      5   ..... // out column (0-720)

  0      1      1      2      3      3   ..... // source column index (0-480)
  1      0      2      1      0      2   ..... // Y sample index in column (range 0-2)

To get the Y (luminance) value, you index the lookup texture with the screen x position. That tells you which source texel to read. Then take the second lookup value and use it to select the correct sample within that texel. In DirectX you can simply index a vec4/float4 with an integer to pick the R/G/B/A component, and GLSL supports the same (vectors can be indexed like arrays).
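A minimal GLSL sketch of that Y-only pass might look like the following. The names and the 720x486 frame size are assumptions for illustration: lookupTex is the 720x1 integer lookup texture described above (source column in .x, component index in .y), and srcTex is the 480-texel-wide V210 source:

#version 130
#extension GL_EXT_gpu_shader4 : enable
in vec2 texcoord;
uniform sampler2D srcTex;       // 480x486 packed source (hypothetical name)
uniform isampler2D lookupTex;   // 720x1 lookup table (hypothetical name)
out vec4 color;

void main(void) {
  // Output pixel position in the 720x486 target.
  ivec2 outPos = ivec2(texcoord * vec2(720.0, 486.0));
  // .x = source column, .y = component index within that texel.
  ivec2 lookup = texelFetch(lookupTex, ivec2(outPos.x, 0), 0).xy;
  vec4 srcTexel = texelFetch(srcTex, ivec2(lookup.x, outPos.y), 0);
  float y = srcTexel[lookup.y]; // GLSL allows indexing a vec4 with an integer
  color = vec4(vec3(y), 1.0);   // greyscale preview of the luma channel
}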

So now you have Y. Repeat the above process for U and V.

Once you have it working, you can try to optimize by packing the above information for all three channels into a single lookup texture rather than three separate ones. Or you can try to find a linear function that, after rounding, produces the column indexes directly; that will save you many texture lookups.

But the whole optimization may be a moot point in your scenario. Just get the simplest case working first.

I purposely didn't write the shader code for you as I am mostly familiar with DirectX. But this should get you started.

Good luck!

Here is the completed shader with all channels and RGB conversion (note that no filtering is performed):

#version 130
#extension GL_EXT_gpu_shader4 : enable
in vec2 texcoord;
uniform mediump sampler2D tex;
out mediump vec4 color;

// YUV offset
const vec3 yuvOffset = vec3(-0.0625, -0.5, -0.5);

// RGB coefficients
// BT.601 colorspace
const vec3 Rcoeff = vec3(1.1643,  0.000,  1.5958);
const vec3 Gcoeff = vec3(1.1643, -0.39173, -0.81290);
const vec3 Bcoeff = vec3(1.1643,  2.017,  0.000);

// U Y V A | Y U Y A | V Y U A | Y V Y A

// Each source texel holds four components (R/G/B/A), so a full component
// index maps to source texel column i / 4 ...
int GROUP_FOR_INDEX(int i) {
  return i / 4;
}

// ... and to component i % 4 within that texel.
int SUBINDEX_FOR_INDEX(int i) {
  return i % 4;
}

// Position of output pixel i's Y sample in the stream of active
// (non-padding) components: U Y V Y U Y V Y U Y V Y ...
int _y(int i) {
  return 2 * i + 1;
}

// Position of the U (Cb) sample shared by the pixel pair containing i.
int _u(int i) {
  return 4 * (i / 2);
}

// Position of the V (Cr) sample shared by the pixel pair containing i.
int _v(int i) {
  return 4 * (i / 2) + 2;
}

// Convert an active-component index to a full component index:
// every third component is followed by one unused (A) slot.
int offset(int i) {
  return i + (i / 3);
}

vec3 ycbcr2rgb(vec3 yuvToConvert) {
  vec3 pix;
  yuvToConvert += yuvOffset;
  pix.r = dot(yuvToConvert, Rcoeff);
  pix.g = dot(yuvToConvert, Gcoeff);
  pix.b = dot(yuvToConvert, Bcoeff);
  return pix;
}

void main(void) {
  ivec2 size = textureSize2D(tex, 0).xy; // 480x486
  ivec2 sizeOrig = ivec2(size.x * 1.5, size.y); // 720x486

  // interpolate 0,0 -> 1,1 texcoords to 0,0 -> 720,486
  ivec2 texcoordDenorm = ivec2(texcoord * sizeOrig);

  // 0 1 1 2 3 3 4 5 5 6 7 7 etc.
  int yOffset = offset(_y(texcoordDenorm.x));
  int sourceColumnIndexY = GROUP_FOR_INDEX(yOffset);

  // 0 0 1 1 2 2 4 4 5 5 6 6 etc.
  int uOffset = offset(_u(texcoordDenorm.x));
  int sourceColumnIndexU = GROUP_FOR_INDEX(uOffset);

  // 0 0 2 2 3 3 4 4 6 6 7 7 etc.
  int vOffset = offset(_v(texcoordDenorm.x));
  int sourceColumnIndexV = GROUP_FOR_INDEX(vOffset);

  // 1 0 2 1 0 2 1 0 2 etc.
  int compY = SUBINDEX_FOR_INDEX(yOffset);

  // 0 0 1 1 2 2 0 0 1 1 2 2 etc.
  int compU = SUBINDEX_FOR_INDEX(uOffset);

  // 2 2 0 0 1 1 2 2 0 0 1 1 etc.
  int compV = SUBINDEX_FOR_INDEX(vOffset);

  vec4 y = texelFetch(tex, ivec2(sourceColumnIndexY, texcoordDenorm.y), 0);
  vec4 u = texelFetch(tex, ivec2(sourceColumnIndexU, texcoordDenorm.y), 0);
  vec4 v = texelFetch(tex, ivec2(sourceColumnIndexV, texcoordDenorm.y), 0);

  vec3 outColor = ycbcr2rgb(vec3(y[compY], u[compU], v[compV]));

  color = vec4(outColor, 1.0);
}

If the image is going to be scaled up on the screen, you will likely want bilinear filtering, but this has to be performed within the shader: hardware filtering on the packed source texture would blend unrelated components.
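For instance, after unpacking the four neighbouring source pixels, the blending step itself is just two horizontal mixes followed by a vertical one (a sketch with illustrative names, not part of the shader above):

// f is the fractional position within the top-left source pixel,
// e.g. fract(sourcePosition - 0.5).
vec3 bilinear(vec3 topLeft, vec3 topRight, vec3 bottomLeft, vec3 bottomRight, vec2 f) {
  vec3 top    = mix(topLeft, topRight, f.x);
  vec3 bottom = mix(bottomLeft, bottomRight, f.x);
  return mix(top, bottom, f.y);
}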
