Hey there, I have the following piece of code:
#if USE_CONST == 1
__constant__ double PNT[ SIZE ];
#else
__device__ double *PNT;
#endif
The correct usage of cudaMemcpyToSymbol prior to CUDA 4.0 is:
cudaMemcpyToSymbol("PNT", point, sizeof(double)*SIZE);
or alternatively:
double *cpnt;
cudaGetSymbolAddress((void **)&cpnt, "PNT");
cudaMemcpy(cpnt, point, sizeof(double)*SIZE, cudaMemcpyHostToDevice);
which might be a bit faster if you are planning to access the symbol from the host API more than once.
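On newer toolkits the symbol is passed directly rather than by name as a string. Here is a minimal sketch of the constant memory case in that style. The value of SIZE, the sum_pnt kernel and the helpers upload_pnt and device_sum are hypothetical names of mine, added only so the upload can be checked from the host:

```cuda
#include <cuda_runtime.h>

#define SIZE 8  // assumed value; the question does not say what SIZE is

__constant__ double PNT[SIZE];

// Hypothetical check kernel: sums PNT on the device.
__global__ void sum_pnt(double *out)
{
    double s = 0.0;
    for (int i = 0; i < SIZE; ++i)
        s += PNT[i];
    *out = s;
}

// Copies SIZE doubles from host memory into the __constant__ symbol.
// Note that PNT itself is passed, not the string "PNT".
cudaError_t upload_pnt(const double *point)
{
    return cudaMemcpyToSymbol(PNT, point, sizeof(double) * SIZE);
}

// Uploads point, runs the check kernel and returns the device-side sum.
cudaError_t device_sum(const double *point, double *result)
{
    double *d_out = nullptr;
    cudaError_t err = upload_pnt(point);
    if (err != cudaSuccess) return err;
    err = cudaMalloc((void **)&d_out, sizeof(double));
    if (err != cudaSuccess) return err;
    sum_pnt<<<1, 1>>>(d_out);
    // This blocking copy also reports any error raised by the kernel.
    err = cudaMemcpy(result, d_out, sizeof(double), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return err;
}
```

Calling device_sum(point, &r) from the host with a filled point array should leave the sum of its elements in r.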
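EDIT: I misunderstood the question. For the global memory version, do something similar to the second version for constant memory. Note that PNT is then a pointer rather than an array, so allocate a device buffer, copy the data into it, and store the buffer's address in the symbol:
double *dpnt, **gpnt;
cudaMalloc((void **)&dpnt, sizeof(double)*SIZE);
cudaMemcpy(dpnt, point, sizeof(double)*SIZE, cudaMemcpyHostToDevice);
cudaGetSymbolAddress((void **)&gpnt, "PNT");
cudaMemcpy(gpnt, &dpnt, sizeof(double *), cudaMemcpyHostToDevice);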
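Because PNT in the #else branch is a pointer and not an array, the device needs both a buffer holding the data and the buffer's address stored in the symbol. Here is a rough sketch of that, assuming a newer toolkit where the symbol is passed directly instead of as a string. The value of SIZE, the sum_pnt kernel and the helpers upload_pnt and device_sum are hypothetical and only there to make the example checkable:

```cuda
#include <cuda_runtime.h>

#define SIZE 8  // assumed value; the question does not say what SIZE is

__device__ double *PNT;  // device-side pointer, as in the #else branch

// Hypothetical check kernel: sums the buffer PNT points to.
__global__ void sum_pnt(double *out)
{
    double s = 0.0;
    for (int i = 0; i < SIZE; ++i)
        s += PNT[i];
    *out = s;
}

// Allocates a device buffer, fills it from point, and makes PNT point to it.
// On success the caller owns *d_buf and must cudaFree it.
cudaError_t upload_pnt(const double *point, double **d_buf)
{
    double **gpnt = nullptr;
    cudaError_t err = cudaMalloc((void **)d_buf, sizeof(double) * SIZE);
    if (err != cudaSuccess) return err;
    err = cudaMemcpy(*d_buf, point, sizeof(double) * SIZE, cudaMemcpyHostToDevice);
    if (err == cudaSuccess)
        err = cudaGetSymbolAddress((void **)&gpnt, PNT);
    if (err == cudaSuccess)
        // Copy one pointer value (not SIZE doubles) into the symbol.
        err = cudaMemcpy(gpnt, d_buf, sizeof(double *), cudaMemcpyHostToDevice);
    if (err != cudaSuccess) cudaFree(*d_buf);
    return err;
}

// Uploads point, runs the check kernel and returns the device-side sum.
cudaError_t device_sum(const double *point, double *result)
{
    double *d_buf = nullptr, *d_out = nullptr;
    cudaError_t err = upload_pnt(point, &d_buf);
    if (err != cudaSuccess) return err;
    err = cudaMalloc((void **)&d_out, sizeof(double));
    if (err == cudaSuccess) {
        sum_pnt<<<1, 1>>>(d_out);
        // This blocking copy also reports any error raised by the kernel.
        err = cudaMemcpy(result, d_out, sizeof(double), cudaMemcpyDeviceToHost);
        cudaFree(d_out);
    }
    cudaFree(d_buf);
    return err;
}
```

The thing to watch is the size of the last copy in upload_pnt: it is sizeof(double *), since only the address goes into PNT, while the data itself lives in the cudaMalloc'd buffer.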