Question
I have a couple of structures whose combined size exceeds the 256 bytes allowed for the parameters of a kernel call.
Both structures are already allocated and copied to device global memory.
1) How can I use these structures in the same kernel without passing them as parameters?
More details: separately, each structure can be passed as a parameter, for example to different kernels. But:
2) How can I use both structures in the same kernel?
Answer 1:
As Robert Crovella suggested in his comment, you should be able to just pass pointers to those areas. I had a similar problem in OpenCL. This is how I implemented the structs:
(My kernel and host functions are in OpenCL, so the syntax will differ for you, but the idea is the same.)
The following two structs are defined in my 'Mapper.c' (host code):
typedef struct data
{
    double dattr[10];
    int d_id;
    int bestCent;
} Data;

typedef struct cent
{
    double cattr[5];
    int c_id;
} Cent;

Data *dataNode;
Cent *centNode;
After allocating memory in the device's global memory, I transferred the data. I had to repeat the struct definitions in the kernel source, as below:
mapper.cl:
#pragma OPENCL EXTENSION cl_khr_fp64 : enable

typedef struct data
{
    double dattr[10];
    int d_id;
    int bestCent;
} Data;

typedef struct cent
{
    double cattr[5];
    int c_id;
} Cent;

__kernel void mapper(__global int *keyMobj, __global int *valueMobj,
                     __global Data *dataMobj, __global Cent *centMobj)
{
    int i = get_global_id(0);
    int j, k, color = 0;
    double dmin = 1000000.0, dx;

    for (j = 0; j < 2; j++) // here 2 is number of centroids considered
    {
        dx = 0.0;
        for (k = 0; k < 2; k++)
        {
            dx += ((centMobj[j].cattr[k]) - (dataMobj[i].dattr[k])) * ((centMobj[j].cattr[k]) - (dataMobj[i].dattr[k]));
        }
        if (dx < dmin)
        {
            color = j;
            dmin = dx;
        }
    }
    keyMobj[i] = color;
    valueMobj[i] = dataMobj[i].d_id;
}
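As you can see, the kernel only receives pointers to those areas: dataMobj and centMobj point to the struct arrays in global memory, just as keyMobj and valueMobj point to the output arrays.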
kernel = clCreateKernel(program, "mapper", &ret);
ret = clSetKernelArg(kernel, 0, sizeof(cl_mem), (void *)&keyMobj);
ret = clSetKernelArg(kernel, 1, sizeof(cl_mem), (void *)&valueMobj);
ret = clSetKernelArg(kernel, 2, sizeof(cl_mem), (void *)&dataMobj);
ret = clSetKernelArg(kernel, 3, sizeof(cl_mem), (void *)&centMobj);
The lines above belong to the host-side code (mapper.c): the first creates the kernel from mapper.cl, and the following four clSetKernelArg calls pass the arguments to it.
Answer 2:
If your data structures are already in global memory, then you can just pass a pointer in as the kernel argument.
On a related note, the limit for kernel arguments is 4KB for devices of compute capability 2.x and higher:
__global__ function parameters are passed to the device:
- via shared memory and are limited to 256 bytes on devices of compute capability 1.x,
- via constant memory and are limited to 4 KB on devices of compute capability 2.x and higher.
__device__ and __global__ functions cannot have a variable number of arguments.
(cf. http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#function-parameters)
Source: https://stackoverflow.com/questions/21895167/ideas-for-cuda-kernel-calls-with-parameters-exceeding-256-bytes