Allocate 2D Array on Device Memory in CUDA
How do I allocate 2D arrays in device memory, and transfer them to and from the host, in CUDA?

Gitmo

I found a solution to this problem. I didn't have to flatten the array. The built-in cudaMallocPitch() function did the job, and I could transfer the array to and from the device using the cudaMemcpy2D() function. For example:

cudaMallocPitch((void**) &array, &pitch, a*sizeof(float), b);

This allocates a 2D array of b rows, each a floats wide. Note that pitch is an output, not an input: the call writes the row stride in bytes to it, and that stride may be larger than a*sizeof(float) because the runtime can pad each row for alignment. The same pitch value must then be passed to cudaMemcpy2D() and used for row addressing inside kernels.

The following code creates a 2D array and loops over the elements. It compiles readily; you may use it.

#include<stdio.h>
#include<cuda.h>
#define
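As a minimal sketch of the approach described above (not the original poster's listing), the following shows a pitched allocation, a copy to the device, a kernel that walks the rows using the pitch, and a copy back. The names ROWS, COLS, addOne and roundtrip_ok are hypothetical and chosen only for illustration; the sizes are arbitrary.

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

#define ROWS 4  /* hypothetical height (b in the text above) */
#define COLS 5  /* hypothetical width in elements (a in the text above) */

/* Adds 1.0f to every element of a pitched 2D array. */
__global__ void addOne(float *dev, size_t pitch, int cols, int rows)
{
    int c = blockIdx.x * blockDim.x + threadIdx.x;
    int r = blockIdx.y * blockDim.y + threadIdx.y;
    if (c < cols && r < rows) {
        /* pitch is in bytes, so advance between rows through a char pointer */
        float *row = (float *)((char *)dev + r * pitch);
        row[c] += 1.0f;
    }
}

/* Returns 1 if the host/device round trip gives the expected values, else 0. */
int roundtrip_ok(void)
{
    float host[ROWS][COLS];
    for (int r = 0; r < ROWS; ++r)
        for (int c = 0; c < COLS; ++c)
            host[r][c] = (float)(r * COLS + c);

    float *dev = NULL;
    size_t pitch = 0;  /* filled in by cudaMallocPitch */
    if (cudaMallocPitch((void **)&dev, &pitch,
                        COLS * sizeof(float), ROWS) != cudaSuccess)
        return 0;

    /* Host rows are tightly packed, so the host pitch is COLS * sizeof(float). */
    if (cudaMemcpy2D(dev, pitch, host, COLS * sizeof(float),
                     COLS * sizeof(float), ROWS,
                     cudaMemcpyHostToDevice) != cudaSuccess) {
        cudaFree(dev);
        return 0;
    }

    dim3 block(16, 16);
    dim3 grid((COLS + block.x - 1) / block.x, (ROWS + block.y - 1) / block.y);
    addOne<<<grid, block>>>(dev, pitch, COLS, ROWS);

    /* This copy runs after the kernel on the default stream. */
    cudaError_t err = cudaMemcpy2D(host, COLS * sizeof(float), dev, pitch,
                                   COLS * sizeof(float), ROWS,
                                   cudaMemcpyDeviceToHost);
    cudaFree(dev);
    if (err != cudaSuccess)
        return 0;

    for (int r = 0; r < ROWS; ++r)
        for (int c = 0; c < COLS; ++c)
            if (host[r][c] != (float)(r * COLS + c) + 1.0f)
                return 0;
    return 1;
}

int main(void)
{
    printf("%s\n", roundtrip_ok() ? "round trip ok" : "round trip failed");
    return 0;
}
```

The width argument of both cudaMallocPitch() and cudaMemcpy2D() is given in bytes, while the height is a row count. Mixing these up is a common source of errors with pitched memory.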