How can I make a kernel function that is callable from both the host and the device?

忘掉有多难 2021-01-13 15:58

The following attempt shows what I am trying to do, but it fails to compile:

__host__ __device__ void f(){}

int main()
{
    f<<<1,1>>>();
}
         


        
2 answers
  •  情歌与酒
    2021-01-13 16:17

    The <<<...>>> launch syntax only works on a CUDA kernel entry point, i.e. a __global__ function, so you need to create one and call f() from it. Something like:

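    #include <cstdio>
    
    // Compiled for both host and device; __CUDA_ARCH__ is defined
    // only during the device compilation pass.
    __host__ __device__ void f() {
    #ifdef __CUDA_ARCH__
        printf ("Device Thread %d\n", threadIdx.x);
    #else
        printf ("Host code!\n");
    #endif
    }
    
    // Kernel entry point that forwards to f() on the device.
    __global__ void kernel() {
       f();
    }
    
    int main() {
       kernel<<<1,1>>>();
       // Wait for the kernel to finish so its output is flushed and errors surface.
       if (cudaDeviceSynchronize() != cudaSuccess) {
           fprintf (stderr, "Cuda call failed\n");
       }
       f();  // ordinary host call
       return 0;
    }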
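
As a further illustration of the answer above, here is a minimal sketch of the same pattern where the shared function returns a value, so the host and device results can be compared. The names square, square_kernel, and square_on_device are hypothetical and not part of the original answer; the sketch assumes a CUDA-capable device and compilation with nvcc.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: compiled for both the host and the device.
__host__ __device__ int square(int x) {
    return x * x;
}

// Kernel entry point: only __global__ functions can be launched with <<<...>>>.
__global__ void square_kernel(int x, int *out) {
    *out = square(x);  // device-side call of the shared function
}

// Hypothetical wrapper: runs square() on the device and copies the result back.
// Returns false if any CUDA call reports an error.
bool square_on_device(int x, int *result) {
    int *d_out = nullptr;
    if (cudaMalloc(&d_out, sizeof(int)) != cudaSuccess) {
        return false;
    }
    square_kernel<<<1, 1>>>(x, d_out);
    // cudaMemcpy on the default stream waits for the kernel to finish.
    bool ok = cudaGetLastError() == cudaSuccess &&
              cudaMemcpy(result, d_out, sizeof(int),
                         cudaMemcpyDeviceToHost) == cudaSuccess;
    cudaFree(d_out);
    return ok;
}

int main() {
    int device_result = 0;
    if (!square_on_device(7, &device_result)) {
        fprintf(stderr, "CUDA call failed\n");
        return 1;
    }
    // The same function is called directly on the host for comparison.
    printf("device: %d, host: %d\n", device_result, square(7));
    return 0;
}
```

The design is the same as in the answer: the __host__ __device__ function holds the shared logic, and a thin __global__ wrapper exists only to give the device a launchable entry point.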
    
