CUDA ptxas warnings (Stack size for entry)

Submitted by 故事扮演 on 2019-12-10 18:24:41

Question


I am getting the following warning, which I don't understand, when compiling CUDA code:

CUDACOMPILE : ptxas warning : Stack size for entry function '_Z24gpu_kernel_get_3d_pointsiPK8RtmPointS1_PKfS3_P10RtmPoint3DPif' cannot be statically determined

The kernel prototype is:

__global__ void gpu_kernel_get_3d_points(int count1, const RtmPoint *pPoints1, const RtmPoint *pPoints2, const float *PL, const float *PR, 
RtmPoint3D *pPoints3D, int *pGlobalCount, float bbox)

All the pointers are pointers to device memory. I don't see why the compiler should have a problem determining the stack size. There are a few local variables in the kernel, but not many. Any ideas? Does this warning matter?


Answer 1:


It seems that your kernel may be allocating memory dynamically on the GPU heap with malloc() or the new operator. If so, that can hurt your kernel's performance.




Answer 2:


This warning appears when a function is recursive. CUDA tries to reserve the stack space before execution, which is not a problem unless you are using recursion. With recursion the stack size isn't predictable: the recursion depth isn't known at compile time, so the amount of memory the stack will use isn't known either. The warning is usually harmless, but if your kernel does exceed the GPU stack, you must increase the stack size manually, for example by calling cudaDeviceSetLimit with cudaLimitStackSize.
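
As a rough illustration of the point above, here is a minimal sketch of a recursive device function together with a host-side call that raises the per-thread stack limit. The names factorial, factorial_kernel and run_factorial, and the 4096-byte limit, are hypothetical and not taken from the question; whether ptxas actually prints the warning for this exact code can depend on the compiler version and optimization settings.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical recursive device function. The compiler cannot bound the
// call depth, so it typically cannot compute the stack size statically.
__device__ unsigned long long factorial(unsigned int n)
{
    return (n <= 1) ? 1ULL : n * factorial(n - 1);
}

// Kernel that calls the recursive function from a single thread.
__global__ void factorial_kernel(unsigned int n, unsigned long long *out)
{
    *out = factorial(n);
}

// Host helper (hypothetical): sets the per-thread stack limit, runs the
// kernel, and returns the result, or 0 if any CUDA call fails.
unsigned long long run_factorial(unsigned int n, size_t stack_bytes)
{
    unsigned long long *d_out = nullptr;
    unsigned long long h_out = 0;

    // Raise (or lower) the stack size available to each GPU thread.
    if (cudaDeviceSetLimit(cudaLimitStackSize, stack_bytes) != cudaSuccess)
        return 0;
    if (cudaMalloc(&d_out, sizeof(*d_out)) != cudaSuccess)
        return 0;

    factorial_kernel<<<1, 1>>>(n, d_out);

    // cudaMemcpy waits for the kernel to finish before copying.
    cudaError_t err = cudaMemcpy(&h_out, d_out, sizeof(h_out),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return (err == cudaSuccess) ? h_out : 0;
}

int main()
{
    size_t default_stack = 0;
    cudaDeviceGetLimit(&default_stack, cudaLimitStackSize);
    printf("default stack size per thread: %zu bytes\n", default_stack);

    printf("10! = %llu\n", run_factorial(10, 4096));
    return 0;
}
```

The stack limit applies per thread, so a large value multiplied by many resident threads can consume a noticeable amount of device memory. It is probably best to raise it only as far as the deepest expected recursion requires.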



Source: https://stackoverflow.com/questions/9950599/cuda-ptxas-warnings-stack-size-for-entry
