How to properly link cuda header file with device functions?

Submitted by ε祈祈猫儿з on 2019-11-30 18:05:58

Question


I'm trying to decouple my code a bit and something fails. Compilation error:

error: calling a __host__ function("DecoupledCallGpu") from a __global__ function("kernel") is not allowed

Code excerpt:

main.c (has a call to cuda host function):

#include "cuda_computations.h"
...
ComputeSomething(&var1,&var2);
...

cuda_computations.cu (contains the kernel and the host master functions, and includes the header that declares the device functions):

#include "cuda_computations.h"
#include "decoupled_functions.cuh"
...
__global__ void kernel(){
...
DecoupledCallGpu(&var_kernel);
}

void ComputeSomething(int *var1, int *var2){
//allocate memory and etc..
...
kernel<<<20,512>>>();
//cleanup
...
}

decoupled_functions.cuh:

#ifndef _DECOUPLEDFUNCTIONS_H_
#define _DECOUPLEDFUNCTIONS_H_

void DecoupledCallGpu(int *var);

#endif

decoupled_functions.cu:

#include "decoupled_functions.cuh"

__device__ void DecoupledCallGpu(int *var){
  *var=0;
}

Compilation:

nvcc -g --ptxas-options=-v -arch=sm_30 -c cuda_computations.cu -o cuda_computations.o -lcudart

Question: why is it that the DecoupledCallGpu is called from host function and not a kernel as it was supposed to?

P.S.: I can share the actual code behind it if you need me to.


Answer 1:


Add the __device__ decorator to the prototype in decoupled_functions.cuh. That should take care of the error message you are seeing.

Then you'll need to use separate compilation and linking amongst your modules. So instead of compiling with -c, compile with -dc, and modify your link command accordingly. A basic example is here.
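
As a rough sketch of both steps, here is what the corrected header might look like, with a possible build sequence shown as comments. The file names are taken from the question; the output name app and the placeholder for the remaining object files are assumptions, so adapt them to your project. Linking through nvcc lets it perform the device-link step for the objects built with -dc.

```cuda
// decoupled_functions.cuh (prototype now carries the __device__ decorator)
#ifndef _DECOUPLEDFUNCTIONS_H_
#define _DECOUPLEDFUNCTIONS_H_

// Matches the definition in decoupled_functions.cu, so any kernel that
// includes this header sees a device function rather than a host one.
__device__ void DecoupledCallGpu(int *var);

#endif

// Possible build sequence (a sketch; "app" is a hypothetical output name):
//   nvcc -arch=sm_30 -dc cuda_computations.cu   -o cuda_computations.o
//   nvcc -arch=sm_30 -dc decoupled_functions.cu -o decoupled_functions.o
//   nvcc -arch=sm_30 cuda_computations.o decoupled_functions.o <other objects> -o app
```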

Your question is a bit confusing:

Question: why is it that the DecoupledCallGpu is called from host function and not a kernel as it was supposed to?

I can't tell if you're tripping over English or if there is a misunderstanding here. The actual error message states:

error: calling a __host__ function("DecoupledCallGpu") from a __global__ function("kernel") is not allowed

This arises because, within the compilation unit (i.e. the module, the file being compiled, which here is cuda_computations.cu), the only description of the function DecoupledCallGpu() is the prototype provided in the header:

void DecoupledCallGpu(int *var);

This prototype indicates an undecorated function in CUDA C, and such functions are equivalent to __host__ (only) decorated functions:

__host__ void DecoupledCallGpu(int *var);

That compilation unit has no knowledge of what is actually in decoupled_functions.cu.

Therefore, when you have kernel code like this:

__global__ void kernel(){       //<- __global__ function
...
DecoupledCallGpu(&var_kernel);  //<- appears as a __host__ function to compiler
}

the compiler thinks you are trying to call a __host__ function from a __global__ function, which is illegal.
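
To see the effect of the decorator in isolation, here is a minimal single-file sketch. It is not your actual code: the kernel's out parameter, the starting value 123, and the RunDemo helper are all assumptions made for illustration. Because the prototype is marked __device__, the call inside the kernel compiles even though the definition only appears later.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Prototype as it would appear in the header: __device__ tells the compiler
// that this function runs on the GPU.
__device__ void DecoupledCallGpu(int *var);

// The kernel has only seen the prototype so far, which is enough for the call.
__global__ void kernel(int *out) {
    int var_kernel = 123;            // hypothetical starting value
    DecoupledCallGpu(&var_kernel);   // legal: __device__ called from __global__
    *out = var_kernel;               // hand the result back to the host
}

// Definition; in the question this lives in decoupled_functions.cu.
__device__ void DecoupledCallGpu(int *var) {
    *var = 0;
}

// Hypothetical host helper: launches the kernel and returns the value it
// wrote, or -1 if a CUDA call fails.
int RunDemo() {
    int *d_out = nullptr;
    int h_out = -1;
    if (cudaMalloc(&d_out, sizeof(int)) != cudaSuccess) return -1;
    kernel<<<1, 1>>>(d_out);
    cudaError_t err = cudaMemcpy(&h_out, d_out, sizeof(int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return err == cudaSuccess ? h_out : -1;
}

int main() {
    printf("value written by DecoupledCallGpu: %d\n", RunDemo());
    return 0;
}
```

Since everything sits in one file here, a plain nvcc compile is enough. Removing __device__ from the prototype alone should bring back a complaint from the compiler, because the declaration and the definition would then disagree about where the function runs.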



Source: https://stackoverflow.com/questions/24459495/how-to-properly-link-cuda-header-file-with-device-functions
