Timing in OpenGL program?

Submitted by 旧时模样 on 2020-01-15 11:44:06

Question


I have learned enough OpenGL/GLUT (using PyOpenGL) to come up with a simple program that sets up a fragment shader, draws a full screen quad, and produces frames in sync with the display (shadertoy-style). I also understand the graphics pipeline to some degree.

What I don't understand is how the OpenGL program and the graphics pipeline fit together. In particular, in my GLUT display callback,

# set uniforms
glDrawArrays(GL_TRIANGLE_STRIP, 0, 4)  # draw quad
glutSwapBuffers()

I suppose I activate the vertex shader by giving it vertices through glDrawArrays, which produces fragments (pixels). But then, does the fragment shader kick in immediately after glDrawArrays? There are fragments, so it can do something. On the other hand, it is still possible that there are further draw commands creating further vertices, which can a) produce new fragments and b) overwrite existing fragments.

I profiled the program and found that 99% of the time is spent in glutSwapBuffers. That is of course partially due to waiting for the vertical sync, but it stays that way when I use a very demanding fragment shader which significantly reduces the frame rate. That suggests that the fragment shader is only activated somewhere in glutSwapBuffers. Is that correct?

I understand that the fragment shader is executed on the GPU, not the CPU, but it still appears that the CPU (program) waits until the GPU (shader) is finished, within glutSwapBuffers...


Answer 1:


"I profiled the program and found that 99% of the time is spent in glutSwapBuffers. That is of course partially due to waiting for the vertical sync, but it stays that way when I use a very demanding fragment shader which significantly reduces the frame rate. That suggests that the fragment shader is only activated somewhere in glutSwapBuffers. Is that correct?"

No, that reasoning is flawed. The main point here is that the fragment shader runs on the GPU, which works asynchronously to the CPU. You are not measuring the fragment shader; you are measuring some implicit CPU-GPU synchronization. It looks like your implementation syncs on the buffer swap (probably once too many frames are queued up), so all you measure is the time the CPU has to wait for the GPU. And if you increase the GPU workload without significantly increasing the CPU workload, your CPU will just spend more time waiting.

OpenGL itself does not define any of this, so the details are ultimately implementation-specific. The spec only guarantees that the implementation behaves as if the fragments were generated in the order in which you draw the primitives (e.g. with blending enabled, the actual order becomes relevant even in overdraw scenarios). But at what point the fragments are generated, and which optimizations happen between vertex processing and the invocation of your fragment shader, is out of your control. GPUs might employ tile-based rasterization schemes, where the actual fragment shading is delayed a bit (if possible) to improve efficiency and avoid overshading.

Note that most GPU drivers work asynchronously. When you call a gl*() command, it usually returns before it has been processed. It might only be queued up for later processing (e.g. in another driver thread), and will ultimately be translated into GPU-specific command buffers which are transferred to the GPU. You can still end up with implicit CPU-GPU synchronization (or CPU-CPU synchronization with a driver thread). For example, when you read back framebuffer data after a draw call, all previous GL commands have to be flushed for processing, and the CPU waits for that processing to finish before it can retrieve the image data. That is also what makes such readbacks so slow.
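To make the queueing argument concrete, here is a toy model in plain Python. It does not use OpenGL and is not how any real driver is implemented. The names simulate and fake_gpu, the one-frame queue limit, and the 20 ms per-frame cost are all assumptions chosen for illustration. The "draw call" only records work, a background thread plays the GPU, and the "buffer swap" blocks whenever the queue of submitted frames is full.

```python
import queue
import threading
import time


def simulate(frames=5, gpu_seconds_per_frame=0.02, max_queued_frames=1):
    """Return (draw_seconds, swap_seconds) as a CPU-side profiler would see them.

    Toy model only: the queue limit and per-frame cost are made-up numbers.
    """
    submitted = queue.Queue(maxsize=max_queued_frames)

    def fake_gpu():
        # Consumes frames one at a time; sleeping stands in for fragment shading.
        while True:
            cost = submitted.get()
            if cost is None:  # sentinel: no more frames
                return
            time.sleep(cost)

    gpu = threading.Thread(target=fake_gpu)
    gpu.start()

    draw_seconds = swap_seconds = 0.0
    for _ in range(frames):
        t0 = time.perf_counter()
        frame = gpu_seconds_per_frame  # "draw call": only records work, returns at once
        t1 = time.perf_counter()
        submitted.put(frame)           # "buffer swap": blocks while the queue is full
        t2 = time.perf_counter()
        draw_seconds += t1 - t0
        swap_seconds += t2 - t1

    submitted.put(None)  # tell the fake GPU to stop
    gpu.join()
    return draw_seconds, swap_seconds


if __name__ == "__main__":
    draw, swap = simulate()
    print(f"draw: {draw * 1e3:.3f} ms, swap: {swap * 1e3:.3f} ms")
```

In this model nearly all of the measured CPU time lands in the swap step, and raising gpu_seconds_per_frame makes the swap look slower even though the swap itself does no shading. That is the same pattern the profiler showed for glutSwapBuffers.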

As a consequence, CPU-side timings of OpenGL code tell you very little about how long the GPU work actually takes. You need to measure the timing on the GPU, and that's what timer queries are for.
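A minimal sketch of such a timer query with PyOpenGL follows, assuming an OpenGL 3.3+ context (or the ARB_timer_query extension) is current on the calling thread. The helper name gpu_time_ms and its gl parameter are hypothetical; the parameter exists only so the function can be tried with a stand-in object when no GPU is available. Output buffers are passed explicitly as ctypes arrays to avoid relying on how a particular PyOpenGL version wraps return values, which is an assumption worth checking against your installed version.

```python
import ctypes


def gpu_time_ms(draw, gl=None):
    """Return the GPU time in milliseconds spent executing the commands in draw().

    Hypothetical helper. By default it uses PyOpenGL and needs a current
    OpenGL 3.3+ context (or ARB_timer_query).
    """
    if gl is None:
        from OpenGL import GL as gl  # imported lazily; requires PyOpenGL

    ids = (ctypes.c_uint * 1)()
    gl.glGenQueries(1, ids)
    try:
        gl.glBeginQuery(gl.GL_TIME_ELAPSED, ids[0])
        draw()  # e.g. set uniforms and call glDrawArrays
        gl.glEndQuery(gl.GL_TIME_ELAPSED)

        elapsed_ns = (ctypes.c_uint64 * 1)()
        # Asking for GL_QUERY_RESULT waits until the GPU has finished the timed
        # commands, so this call is itself a CPU-GPU synchronization point.
        gl.glGetQueryObjectui64v(ids[0], gl.GL_QUERY_RESULT, elapsed_ns)
    finally:
        gl.glDeleteQueries(1, ids)

    return elapsed_ns[0] / 1e6  # nanoseconds to milliseconds
```

In the display callback from the question this could be used as gpu_time_ms(lambda: glDrawArrays(GL_TRIANGLE_STRIP, 0, 4)) before glutSwapBuffers(). Because reading the result immediately stalls the CPU, a real-time application would more likely poll GL_QUERY_RESULT_AVAILABLE or fetch the result one or more frames later.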



Source: https://stackoverflow.com/questions/56136145/timing-in-opengl-program
