Is just-in-time (JIT) compilation of a CUDA kernel possible?

Submitted by 我是研究僧i on 2019-12-21 17:53:52

Question


Does CUDA support JIT compilation of a CUDA kernel?

I know that OpenCL offers this feature.

I have some variables that do not change during runtime (i.e. they depend only on the input file), so I would like to define their values with a macro when the kernel is compiled (i.e. at runtime).

When I define these values manually at compile time, my register usage drops from 53 to 46, which greatly improves performance.
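To make the idea in the question concrete, here is a minimal Python sketch of the approach: constants that are only known after reading the input file are baked into the kernel source as `#define` macros before the source is handed to a runtime compiler, so the compiler can fold them instead of holding them in registers. The kernel body and names here are invented for illustration.

```python
# Constants read from the input file at runtime are substituted into the
# kernel source as #define macros, so the CUDA compiler sees them as
# compile-time constants.
KERNEL_TEMPLATE = """
#define N_PARTICLES {n_particles}
#define CUTOFF {cutoff}f

extern "C" __global__ void step(float *pos)
{{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < N_PARTICLES && pos[i] < CUTOFF)
        pos[i] += 1.0f;
}}
"""

def build_kernel_source(n_particles, cutoff):
    """Return CUDA source with the input-file constants baked in."""
    return KERNEL_TEMPLATE.format(n_particles=n_particles, cutoff=cutoff)
```

The rendered string can then be passed to a runtime compiler (pycuda or NVRTC, as the answers below discuss) instead of being compiled ahead of time.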


Answer 1:


If it is feasible for you to use Python, you can use the excellent pycuda module to compile your kernels at runtime. Combined with a templating engine such as Mako, you will have a very powerful meta-programming environment that will allow you to dynamically tune your kernels for whatever architecture and specific device properties happen to be available to you (obviously some things will be difficult to make fully dynamic and automatic).
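A rough sketch of this templating approach, using the standard library's `string.Template` in place of Mako to keep the example dependency-free; the kernel and parameter names are made up for illustration, and the pycuda compilation step is shown commented out since it needs a CUDA device.

```python
from string import Template

# Kernel source with placeholders for values chosen at runtime
# (e.g. from device properties or the input file).
KERNEL_TMPL = Template("""
extern "C" __global__ void scale(float *data, int n)
{
    int i = blockIdx.x * ${block_size} + threadIdx.x;
    if (i < n)
        data[i] *= ${factor}f;
}
""")

def render_kernel(block_size, factor):
    """Substitute runtime-chosen values into the kernel source."""
    return KERNEL_TMPL.substitute(block_size=block_size, factor=factor)

src = render_kernel(block_size=256, factor=0.5)

# With pycuda installed and a CUDA device present, the rendered source is
# compiled at runtime roughly like this:
#   from pycuda.compiler import SourceModule
#   mod = SourceModule(src, no_extern_c=True)
#   scale = mod.get_function("scale")
```

Mako adds loops and conditionals on top of plain substitution, which is what makes the fully dynamic kernel generation mentioned above practical.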

You could also consider just maintaining a few distinct versions of your kernel with different parameters, between which your program could choose at runtime based on whatever input you are feeding to it.
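The variant-selection alternative can be sketched as a simple dispatch table. The two "kernels" below are plain Python stand-ins (in a real build they would be separately compiled CUDA kernels, each with its own baked-in constants); only the selection logic is the point here.

```python
# Stand-ins for two kernel variants compiled ahead of time with different
# baked-in sizes (e.g. #define N 256 vs. #define N 4096).
def step_small(data):
    return [x + 1 for x in data]

def step_large(data):
    return [x + 1 for x in data]

VARIANTS = {256: step_small, 4096: step_large}

def select_kernel(n):
    """Pick the precompiled variant whose compile-time size is closest
    to the problem size n observed at runtime."""
    return VARIANTS[min(VARIANTS, key=lambda size: abs(size - n))]
```

This avoids runtime compilation entirely, at the cost of only covering the parameter values you anticipated when building.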




Answer 2:


This became possible with the NVRTC library introduced in CUDA 7.0, which lets you compile CUDA C++ source to PTX at runtime.

http://devblogs.nvidia.com/parallelforall/cuda-7-release-candidate-feature-overview/
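The NVRTC flow can be sketched in Python with NVIDIA's cuda-python bindings (`pip install cuda-python`). This is a hedged sketch, not a tested implementation: the call sequence (`nvrtcCreateProgram` → `nvrtcCompileProgram` → `nvrtcGetPTX`) follows NVIDIA's published cuda-python examples, but verify the signatures against the current docs. The import is kept inside the function because the bindings need the CUDA toolkit installed; the kernel itself is a placeholder.

```python
# Placeholder kernel source; NVRTC compiles it to PTX at runtime.
KERNEL_SRC = b"""
extern "C" __global__ void axpy(float a, float *x, float *y, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        y[i] = a * x[i] + y[i];
}
"""

def compile_to_ptx(source):
    """Compile CUDA source to PTX with NVRTC (compilation itself needs
    the toolkit, not a GPU)."""
    from cuda import nvrtc  # requires the cuda-python package
    err, prog = nvrtc.nvrtcCreateProgram(source, b"axpy.cu", 0, [], [])
    err, = nvrtc.nvrtcCompileProgram(prog, 0, [])
    err, size = nvrtc.nvrtcGetPTXSize(prog)
    ptx = b" " * size
    err, = nvrtc.nvrtcGetPTX(prog, ptx)
    err, = nvrtc.nvrtcDestroyProgram(prog)
    return ptx
```

The resulting PTX is then loaded with the driver API (`cuModuleLoadData` and `cuModuleGetFunction`) to obtain a launchable kernel.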

But what kind of advantage do you gain? Personally, I have not found the benefits of dynamic compilation to be that dramatic.



Source: https://stackoverflow.com/questions/13567123/is-just-in-time-jit-compilation-of-a-cuda-kernel-possible
