OpenCL user defined inline functions

Submitted by 天涯浪子 on 2019-11-30 04:13:31
Kayhano

The function used to create a program from source is:

cl_program clCreateProgramWithSource(
    cl_context context,
    cl_uint count,
    const char **strings,
    const size_t *lengths,
    cl_int *errcode_ret);

You can define your own functions in the source text passed through the strings parameter, like this:

float AddVector(float a, float b)
{
    return a + b;
}

kernel void VectorAdd(
    global const float* a,
    global const float* b,
    global float* c )
{
    int index = get_global_id(0);
    //c[index] = a[index] + b[index];
    c[index] = AddVector(a[index], b[index]);
}

Now you have one user-defined function, "AddVector", and one kernel function, "VectorAdd".
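As a rough host-side sketch of the idea above, the helper and the kernel can be kept as two C strings and handed to clCreateProgramWithSource together. The names kHelperSrc, kKernelSrc, kSources, total_source_bytes and create_vector_add_program are made up for illustration, and the HAVE_OPENCL guard and the <CL/cl.h> header path are assumptions about the build setup (Apple platforms use a different header path). The kernel text here uses const pointers rather than the read_only/write_only qualifiers, which OpenCL C intends for image and pipe arguments.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Define HAVE_OPENCL when building against an OpenCL SDK (assumed setup). */
#ifdef HAVE_OPENCL
#include <CL/cl.h>
#endif

/* The helper comes first: the strings together form one program source,
 * so AddVector should be defined before the kernel that calls it. */
static const char kHelperSrc[] =
    "float AddVector(float a, float b)\n"
    "{\n"
    "    return a + b;\n"
    "}\n";

static const char kKernelSrc[] =
    "kernel void VectorAdd(global const float* a,\n"
    "                      global const float* b,\n"
    "                      global float* c)\n"
    "{\n"
    "    int index = get_global_id(0);\n"
    "    c[index] = AddVector(a[index], b[index]);\n"
    "}\n";

enum { SOURCE_COUNT = 2 };
static const char *kSources[SOURCE_COUNT] = { kHelperSrc, kKernelSrc };

/* Total size of the program text, e.g. for logging (hypothetical helper). */
static size_t total_source_bytes(void)
{
    size_t total = 0;
    for (int i = 0; i < SOURCE_COUNT; ++i) {
        assert(kSources[i] != NULL);
        total += strlen(kSources[i]);
    }
    return total;
}

#ifdef HAVE_OPENCL
/* Passing NULL for lengths tells OpenCL every string is null-terminated. */
static cl_program create_vector_add_program(cl_context context, cl_int *err)
{
    return clCreateProgramWithSource(context, SOURCE_COUNT, kSources, NULL, err);
}
#endif
```

With an OpenCL SDK available, you would compile with HAVE_OPENCL defined, call create_vector_add_program with a valid context, and then build the returned program with clBuildProgram before creating the VectorAdd kernel.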

Based on the code samples here, you can just write functions like:

inline int add(int a, int b)
{
    return a + b;
}

(E.g., look at the .cl file in the DXTC or bitonic sort examples.)

I don't know if that's an NVIDIA-only extension, but the OpenCL documentation talks about "auxiliary functions" as well as kernels.
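One detail worth a small sketch: OpenCL C is based on C99, and in C99 a plain inline definition does not by itself provide an external definition, so a call the compiler decides not to inline can fail to link. Whether a particular OpenCL compiler behaves this way is implementation-dependent; writing static inline is a conservative spelling that sidesteps the question. The example below is ordinary C99 that would also be valid in a .cl file; add3 and check_add are hypothetical names added for illustration.

```c
#include <assert.h>

/* static inline: the definition is private to this translation unit, so the
 * call resolves whether or not the compiler actually inlines it. */
static inline int add(int a, int b)
{
    return a + b;
}

/* A second helper that calls the first, showing that helpers can nest. */
static inline int add3(int a, int b, int c)
{
    return add(add(a, b), c);
}

/* Small self-check using the helpers above. */
static void check_add(void)
{
    assert(add(2, 3) == 5);
    assert(add3(1, 2, 3) == 6);
}
```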

Yktula

OpenCL supports auxiliary functions. See page 19 of this link for examples.

I googled around a bit, and just kept coming back to this question :-P

In the end, I used macros, since inlining is implementation-dependent anyway, and as far as I can tell macros don't have any major disadvantage in the context of C99-based OpenCL programs. E.g.:

#define getFilterBoardOffset( filter, inputPlane ) \
    ( ( (filter) * gInputPlanes + (inputPlane) ) * gFilterSizeSquared )
#define getResultBoardOffset( n, filter ) \
    ( ( (n) * gNumFilters + (filter) ) * gOutputBoardSizeSquared )

instead of:

inline float getFilterBoardOffset( float filter, int inputPlane ) { 
    return ( filter * gInputPlanes + inputPlane ) * gFilterSizeSquared; 
}
inline float getResultBoardOffset( float n, int filter ) { 
    return ( n * gNumFilters + filter ) * gOutputBoardSizeSquared; 
}
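One caveat when choosing macros over functions is operator precedence: a macro that omits parentheses around its parameters can compute something different from the equivalent function when it is passed an expression. The plain C sketch below illustrates this under assumed values; INPUT_PLANES and FILTER_SIZE_SQUARED are hypothetical stand-ins for gInputPlanes and gFilterSizeSquared, and OFFSET_UNSAFE, OFFSET_SAFE, offset_fn and check_offsets are names invented for the example.

```c
#include <assert.h>

/* Hypothetical constants standing in for gInputPlanes and gFilterSizeSquared. */
enum { INPUT_PLANES = 3, FILTER_SIZE_SQUARED = 4 };

/* Macro written without parentheses around its parameters. */
#define OFFSET_UNSAFE(filter, inputPlane) \
    ((filter * INPUT_PLANES + inputPlane) * FILTER_SIZE_SQUARED)

/* Same macro with each parameter parenthesized. */
#define OFFSET_SAFE(filter, inputPlane) \
    (((filter) * INPUT_PLANES + (inputPlane)) * FILTER_SIZE_SQUARED)

/* Function version: each argument is evaluated once, before the body runs. */
static inline int offset_fn(int filter, int inputPlane)
{
    return (filter * INPUT_PLANES + inputPlane) * FILTER_SIZE_SQUARED;
}

static void check_offsets(void)
{
    int f = 1;
    /* Expands to (f + 1 * 3 + 2) * 4, so the "+ 1" is not multiplied. */
    assert(OFFSET_UNSAFE(f + 1, 2) == 24);
    /* Parenthesized macro and function both compute ((f + 1) * 3 + 2) * 4. */
    assert(OFFSET_SAFE(f + 1, 2) == 32);
    assert(offset_fn(f + 1, 2) == 32);
}
```

With simple variable arguments all three forms agree, so the difference only shows up once an expression is passed in.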