OpenCL: Correct results on CPU not on GPU: how to manage memory correctly?
Question

__kernel void CKmix(__global short* MCL, __global short* MPCL, __global short* C, int S, int B)
{
    unsigned int i = get_global_id(0);   // work-item index in dimension 0
    unsigned int ii = get_global_id(1);  // work-item index in dimension 1
    MCL[i] += MPCL[B*ii + i + C[ii] + S];
}

The kernel seems OK: it compiles successfully, and I have obtained the correct results using the CPU as the device. However, that was when the program released and recreated my memory objects each time the kernel was called, which for my testing purposes is about 16000 times. The code I am posting is