OpenGL-OpenCL interop transfer times + texturing from bitmap

Submitted by 冷眼眸甩不掉的悲伤 on 2019-12-07 04:11:16

Question


Two-part question:

I'm working on a school project that uses the Game of Life as a vehicle to experiment with GPGPU. I'm using OpenCL for the simulation and OpenGL for realtime visualization, and the goal is to get this thing as big and as fast as possible. Profiling shows that the frame time is dominated by CL acquiring and releasing the GL buffers, and that the time cost is directly proportional to the size of the buffer.

1) Is this normal? Why should this be? To the best of my understanding, the buffer never leaves device memory, and the CL Acquire/Release acts like a mutex. Does OpenCL lock/unlock each byte individually or something?

To get around this I've shrunk from 24-bit RGBA color mode (OpenGL's preferred color mode as I understand it?) to 8-bit RGB color. This has resulted in a major speedup, but after tuning my kernel, the transfer times are dominating again.

In the absence of any ideas on how to eliminate the transfer times entirely (short of porting my kernel from OpenCL to GLSL, which would exceed the original scope of the project), I now figure that my best bet is to write to a bitmap (as opposed to the 8-bit pixmap I'm currently using) and then use that bitmap with a color index to texture a quad.
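To make the size trade-off concrete, here is a minimal C sketch of packing cells at one bit each and mapping them through a two-entry color index. The helper names (`bitmap_bytes`, `set_cell`, `get_cell`, `COLOR_INDEX`), the MSB-first bit order, and the byte-padded rows are all assumptions for illustration, not part of the project described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes needed for a w-by-h grid at one bit per cell.
   Assumption: each row is padded up to a whole number of bytes. */
static size_t bitmap_bytes(size_t w, size_t h) {
    return ((w + 7) / 8) * h;
}

/* Set (alive != 0) or clear (alive == 0) the bit for cell (x, y).
   Assumption: the leftmost cell of each byte is the most significant bit. */
static void set_cell(uint8_t *bits, size_t w, size_t x, size_t y, int alive) {
    assert(x < w);
    size_t stride = (w + 7) / 8;                 /* bytes per row */
    uint8_t mask = (uint8_t)(0x80u >> (x % 8));
    uint8_t *byte = &bits[y * stride + x / 8];
    if (alive) {
        *byte |= mask;
    } else {
        *byte &= (uint8_t)~mask;
    }
}

/* Return 1 if cell (x, y) is alive, 0 otherwise. */
static int get_cell(const uint8_t *bits, size_t w, size_t x, size_t y) {
    assert(x < w);
    size_t stride = (w + 7) / 8;
    return (bits[y * stride + x / 8] >> (7 - x % 8)) & 1;
}

/* Hypothetical two-entry color index: dead cells map to 0, live cells to 255. */
static const uint8_t COLOR_INDEX[2] = { 0x00, 0xFF };

/* Look up the display value for cell (x, y) through the color index. */
static uint8_t cell_color(const uint8_t *bits, size_t w, size_t x, size_t y) {
    return COLOR_INDEX[get_cell(bits, w, x, y)];
}
```

For a hypothetical 1024 x 1024 grid, this layout needs `bitmap_bytes(1024, 1024)` = 131,072 bytes per frame, compared with 1,048,576 bytes at one byte per cell and 4,194,304 bytes at four bytes per cell (assuming RGBA at 8 bits per channel).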

2) Can I texture a quad directly using a bitmap? I have considered using glBitmap to draw to an auxiliary buffer, and then using this buffer to texture my quad, but I would prefer to use a more direct route if one is available.


Answer 1:


The design intent behind the CL/GL interop acquire and release calls was for them to be simple ownership transfers. However, many early implementations copied the images from CL to GL and back.

Unless you use the sync object extensions in OpenCL 1.1, you need to call glFinish before you acquire and clFinish around the release, so that OpenCL has finished with the buffers before OpenGL touches them again. You will see a lot of time spent here because all queued work has to finish before these calls return. On some platforms you can use clFlush instead of clFinish; check the OpenCL documentation from your vendor.

With the latest NVIDIA and AMD drivers on reasonably recent hardware, I'm seeing the acquire and release calls complete quickly for HD-video-sized images.



Source: https://stackoverflow.com/questions/13716855/opengl-opencl-interop-transfer-times-texturing-from-bitmap
