Time measuring in PyOpenCL

Submitted by ぐ巨炮叔叔 on 2020-01-23 16:46:26

Question


I am running a kernel using PyOpenCL on an FPGA and on a GPU. To measure how long it takes to execute, I use:

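from time import time

# event.profile is only available if queue was created with
# properties=pyopencl.command_queue_properties.PROFILING_ENABLE.
t1 = time()
event = mykernel(queue, (c_width, c_height), (block_size, block_size), d_c_buf, d_a_buf, d_b_buf, a_width, b_width)
event.wait()
t2 = time()

compute_time = t2 - t1  # host wall-clock time, in seconds
compute_time_e = (event.profile.end - event.profile.start) * 1e-9  # device time; profile timestamps are in nanoseconds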

This gives me the execution time from the point of view of the host (compute_time) and from the point of view of the device (compute_time_e). The problem is that these values are very different:

compute (host-timed) [s]: 0.0009386539459228516
compute (event-timed) [s]:  9.4528e-05

Does anyone know what could be the reason for this difference? And, more importantly, which one is more accurate?

Thank you.


Answer 1:


Both of those numbers look right to me. If I am reading this correctly, the host is measuring about 10x the device time, which is not unusual for a small kernel because the host measurement also includes transfer latency. Your host time covers the communication across the board (PCB) between host and device, while your device time covers only the on-chip operation.

I think your program timing breaks down like this:

  • Kernel Execution Time: 0.1ms // event-timed
  • Transfer Time: 0.8ms // (host-timed - event-timed)
  • Total Time: 0.9ms // host-timed
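
The breakdown above is just a subtraction, and it can be reproduced from the two numbers in the question. The following is a minimal sketch; the function name timing_breakdown and the label "overhead" are illustrative assumptions, and the subtraction lumps together everything the host sees outside the kernel itself.

```python
def timing_breakdown(host_s, event_s):
    """Split a host-timed interval into device execution time and the rest.

    host_s:  wall-clock time measured on the host, in seconds
    event_s: (event.profile.end - event.profile.start) * 1e-9, in seconds
    """
    overhead_s = host_s - event_s  # everything that is not kernel execution
    return {"kernel_s": event_s, "overhead_s": overhead_s, "total_s": host_s}


# Values reported in the question, in seconds.
breakdown = timing_breakdown(0.0009386539459228516, 9.4528e-05)
for name, seconds in breakdown.items():
    print(f"{name}: {seconds * 1e3:.2f} ms")
```

The rounded figures in the list (0.1 ms, 0.8 ms, 0.9 ms) correspond to these three values.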

If you are curious about the situation, try running a kernel that takes much longer on the device. You should start to see these numbers match up much more closely, as the fixed transfer time becomes a smaller share of the overall time.

For example:

  • Kernel Execution Time: 900ms
  • Transfer Time: 0.8ms
  • Total Time: 900.8ms
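
To see how quickly the two measurements converge, here is a small sketch that computes the ratio of host time to device time for several kernel durations. It assumes, as this answer does, that the overhead stays fixed at roughly 0.8 ms; the function name and the chosen durations are made up for illustration.

```python
def host_to_device_ratio(kernel_s, overhead_s):
    """Ratio of host-timed duration to device-timed duration."""
    return (kernel_s + overhead_s) / kernel_s


OVERHEAD_S = 0.8e-3  # assumed fixed overhead, taken from the estimate above

# Hypothetical kernel durations, from very short to long.
for kernel_s in (0.1e-3, 1e-3, 10e-3, 900e-3):
    ratio = host_to_device_ratio(kernel_s, OVERHEAD_S)
    print(f"kernel {kernel_s * 1e3:7.1f} ms -> host/device ratio {ratio:.3f}")
```

With a 0.1 ms kernel the host sees about nine times the device time, while with a 900 ms kernel the two differ by less than 0.1%.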



Answer 2:


You can learn quite a lot from Intel's site on OpenCL. It states that event.profile only gives an indication of the pure hardware execution time of the kernel and leaves out the data transfer times (which are included in your first measurement). Therefore the host-side wall-clock time may give different results. However, it also states that if you target the CPU as the OpenCL device, the time difference should become smaller (or even negligible).
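
One way to check what event.profile actually covers is to look at all four timestamps it exposes (queued, submit, start, end) rather than only start and end. The sketch below is an assumption-laden illustration: profile_phases is a hypothetical helper, and the event is a stand-in object with made-up timestamps so the snippet runs without an OpenCL device. With a real pyopencl event from a profiling-enabled queue, the same attribute names should apply.

```python
from types import SimpleNamespace


def profile_phases(event):
    """Return the phases of a profiled command, in seconds.

    `event` is expected to look like a pyopencl event from a queue created
    with profiling enabled: event.profile.queued, .submit, .start and .end
    are device timestamps in nanoseconds.
    """
    p = event.profile
    return {
        "queued_to_submit_s": (p.submit - p.queued) * 1e-9,  # waiting in the host-side queue
        "submit_to_start_s": (p.start - p.submit) * 1e-9,    # waiting on the device
        "start_to_end_s": (p.end - p.start) * 1e-9,          # kernel execution only
    }


# Stand-in for a real event; the timestamps are invented for illustration.
fake_event = SimpleNamespace(
    profile=SimpleNamespace(queued=0, submit=200_000, start=750_000, end=844_528)
)

phases = profile_phases(fake_event)
for name, seconds in phases.items():
    print(f"{name}: {seconds * 1e6:.1f} us")
```

Only the last phase corresponds to compute_time_e in the question; the earlier phases, plus any time spent in the host runtime before the command is queued, show up only in the host-side measurement.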



Source: https://stackoverflow.com/questions/49598214/time-measuring-in-pyopencl
