CUDA function pointers

误落风尘 2020-11-30 14:56

I was trying to do something like this in CUDA (I actually need to write some integration functions):

#include 
using namespace std;

float         


        
2 Answers
  • 2020-11-30 15:35

    To get rid of your compile error, you'll have to use -gencode arch=compute_20,code=sm_20 as a compiler argument when compiling your code. But then you'll likely have some runtime problems:

    From the CUDA Programming Guide (http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#functions):

    Function pointers to __global__ functions are supported in host code, but not in device code. Function pointers to __device__ functions are only supported in device code compiled for devices of compute capability 2.x and higher.

    It is not allowed to take the address of a __device__ function in host code.

    so you can have something like this (adapted from the "FunctionPointers" sample):

    //your function pointer type - returns unsigned char, takes parameters of type unsigned char and float
    typedef unsigned char(*pointFunction_t)(unsigned char, float);
    
    //some device function to be pointed to
    __device__ unsigned char
    Threshold(unsigned char in, float thresh)
    {
       ...
    }
    
    //pComputeThreshold is a device-side function pointer to your __device__ function
    __device__ pointFunction_t pComputeThreshold = Threshold;
    //the host-side function pointer to your __device__ function
    pointFunction_t h_pointFunction;
    
    //in host code: copy the function pointers to their host equivalent
    cudaMemcpyFromSymbol(&h_pointFunction, pComputeThreshold, sizeof(pointFunction_t));
    

    You can then pass the h_pointFunction as a parameter to your kernel, which can use it to call your __device__ function.

    //your kernel taking your __device__ function pointer as a parameter
    __global__ void kernel(pointFunction_t pPointOperation)
    {
        unsigned char tmp;
        ...
        tmp = (*pPointOperation)(tmp, 150.0f);
        ...
    }
    
    //invoke the kernel in host code, passing in your host-side __device__ function pointer
    kernel<<<...>>>(h_pointFunction);
    

    Hopefully that makes some sense. In short, you would have to change your f1 function to be a __device__ function and follow a similar procedure (the typedefs aren't necessary, but they do make the code nicer) to get a valid function pointer on the host side that you can pass to your kernel. I'd also advise looking over the FunctionPointers CUDA sample.
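
To tie this back to the integration use case in the question, here is a minimal sketch of the same pattern end to end. The names square, d_square, midpointKernel and integrateSquare are hypothetical (the question's f1 is not shown in full), and the midpoint rule is only an illustrative choice of integration scheme.

```cuda
#include <vector>
#include <cuda_runtime.h>

// Function pointer type for an integrand: takes a float, returns a float.
typedef float (*integrand_t)(float);

// Hypothetical integrand, standing in for the question's f1.
__device__ float square(float x) { return x * x; }

// Device-side variable that holds the address of the __device__ function.
// The address is taken in device code, so it is stored in a __device__ symbol.
__device__ integrand_t d_square = square;

// Each thread evaluates the integrand at one midpoint and stores that slice's area.
__global__ void midpointKernel(integrand_t f, float a, float h, int n, float *out)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = f(a + (i + 0.5f) * h) * h;
    }
}

// Host helper: integrates the function behind d_square over [a, b] using n slices.
// Returns false if n is not positive or any CUDA call fails.
bool integrateSquare(float a, float b, int n, float *result)
{
    if (n <= 0) return false;

    // Copy the device function address out of the __device__ symbol.
    integrand_t h_f = nullptr;
    if (cudaMemcpyFromSymbol(&h_f, d_square, sizeof(integrand_t)) != cudaSuccess)
        return false;

    float *d_out = nullptr;
    if (cudaMalloc(&d_out, n * sizeof(float)) != cudaSuccess) return false;

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    midpointKernel<<<blocks, threads>>>(h_f, a, (b - a) / n, n, d_out);

    // The blocking copy also waits for the kernel to finish.
    std::vector<float> h_out(n);
    cudaError_t err = cudaMemcpy(h_out.data(), d_out, n * sizeof(float),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    if (err != cudaSuccess) return false;

    // Sum the slice areas on the host (kept simple; a device reduction would also work).
    double sum = 0.0;
    for (float v : h_out) sum += v;
    *result = static_cast<float>(sum);
    return true;
}
```

Calling integrateSquare(0.0f, 1.0f, 1024, &area) from main should give a value close to 1/3. The host never calls or dereferences h_f; it only carries the device address from the symbol to the kernel argument.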

  • 2020-11-30 15:47

    Even though you may be able to compile this code (see @Robert Crovella's answer), it will not work. You cannot pass function pointers from host code, because the host compiler has no way of figuring out the device function's address.
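
One way to sidestep passing pointers from the host at all is to pass a plain selector and take the function address inside device code. This is a minimal sketch under that assumption; square, cube, applyKernel and applyOnDevice are hypothetical names, not part of the original code.

```cuda
#include <vector>
#include <cuda_runtime.h>

typedef float (*unary_t)(float);

// Hypothetical device functions to choose between.
__device__ float square(float x) { return x * x; }
__device__ float cube(float x)   { return x * x * x; }

// The host passes an integer selector; the function address is taken in device code.
__global__ void applyKernel(int which, const float *in, float *out, int n)
{
    unary_t f = (which == 0) ? square : cube;
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = f(in[i]);
    }
}

// Host wrapper: applies the selected function to every element of in.
// Returns false if in is empty or any CUDA call fails.
bool applyOnDevice(int which, const std::vector<float> &in, std::vector<float> &out)
{
    const int n = static_cast<int>(in.size());
    if (n == 0) return false;
    const size_t bytes = n * sizeof(float);

    float *d_in = nullptr, *d_out = nullptr;
    if (cudaMalloc(&d_in, bytes) != cudaSuccess) return false;
    if (cudaMalloc(&d_out, bytes) != cudaSuccess) { cudaFree(d_in); return false; }

    cudaError_t err = cudaMemcpy(d_in, in.data(), bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        const int threads = 256;
        applyKernel<<<(n + threads - 1) / threads, threads>>>(which, d_in, d_out, n);
        out.resize(n);
        // The blocking copy also waits for the kernel to finish.
        err = cudaMemcpy(out.data(), d_out, bytes, cudaMemcpyDeviceToHost);
    }
    cudaFree(d_in);
    cudaFree(d_out);
    return err == cudaSuccess;
}
```

The trade-off is that the set of callable functions is fixed inside the kernel, whereas the symbol-copy approach lets the host choose among any device function that has been stored in a __device__ pointer variable.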
