Compiling for Compute Capability 2.x in CUDA C for VS2010

Posted by 不问归期 on 2020-01-07 02:27:06

Question


I was following this: Dynamically allocating memory inside __device/global__ CUDA kernel

But it still doesn't compile.

error : calling a host function("_malloc_dbg") from a __device__/__global__  
function("kernel") is not allowed

error MSB3721: The command ""C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA  
\v4.1\bin\nvcc.exe" -gencode=arch=compute_20,code=\"sm_20,compute_20\"  
--use-local-env --cl-version 2010 -ccbin "c:\Program Files (x86)\Microsoft Visual  
Studio 10.0\VC\bin\x86_amd64" -I"..\..\..\Source\Include" -G0  --keep-dir   
"x64\Debug" -maxrregcount=0  --machine 64 --compile  -g  -Xcompiler "/EHsc /nologo 
/Od /Zi  /MDd " -o "x64\Debug\move.cu.obj"  "C:\Source\scene\move.cu"" exited with  
code 2. C:\Program Files (x86)\MSBuild\Microsoft.Cpp\v4.0\BuildCustomizations\CUDA  
4.1.targets     361 10  

As suggested, I added #if __CUDA_ARCH__ >= 200 and it returns false.

What else could be the issue? I'm running on a GTX 480.

Edit: I have this warning as well: #warning C4005: '_malloca' : macro redefinition


Answer 1:


I understand you solved your main problem, but one question remains:

I added #if __CUDA_ARCH__ >= 200 and it returns false.

CUDA code is compiled at least twice: one compilation pass generates the CPU (host) code, and another generates the device code. __CUDA_ARCH__ is defined only during device code generation. It is possible to run even more compilation passes and produce GPU code for several architectures; the CPU code would not change, but the GPU code would.

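I suspect that you are testing #if __CUDA_ARCH__ >= 200 while the CPU code is being produced. In that pass the macro is undefined, so the preprocessor treats it as 0 and the condition is false.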
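To make the two passes visible, here is a minimal sketch, assuming a device of compute capability 2.0 or higher and a build that targets it (for example the -gencode=arch=compute_20 setting from the question). The helper name report_pass is hypothetical and only serves the illustration; the kernel body is not the asker's code.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: reports which compilation pass produced the code that is running.
__host__ __device__ void report_pass()
{
#if defined(__CUDA_ARCH__)
    // Device pass: __CUDA_ARCH__ holds the target architecture, e.g. 200 for compute_20.
    printf("device pass, __CUDA_ARCH__ = %d\n", __CUDA_ARCH__);
#else
    // Host pass: the macro is undefined, so "#if __CUDA_ARCH__ >= 200" is false here.
    printf("host pass, __CUDA_ARCH__ is undefined\n");
#endif
}

__global__ void kernel()
{
#if __CUDA_ARCH__ >= 200
    // In-kernel malloc/free require compute capability 2.0 or higher.
    int *p = (int *)malloc(4 * sizeof(int));
    if (p != NULL) {
        p[0] = 42;
        free(p);
    }
#endif
    report_pass();
}

int main()
{
    report_pass();            // compiled in the host pass
    kernel<<<1, 1>>>();       // compiled in the device pass
    cudaDeviceSynchronize();  // wait for the kernel and flush device-side printf
    return 0;
}
```

The guard belongs inside device code, where the device pass sees the real architecture value. A check placed in host-only code, or inspected while the host pass is running, will always take the "undefined" branch regardless of which GPU is installed.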



Source: https://stackoverflow.com/questions/9056183/compiling-for-compute-capability-2-x-in-cuda-c-for-vs2010
