I am confused about the difference between the intended use of device pointers and cudaArray structures. Could someone please explain why I would use one versus the other?
A cudaArray is an opaque block of memory that is optimized for binding to textures. Textures can read memory laid out along a space-filling curve, which gives a better texture cache hit rate because of better 2D spatial locality. Copying data into a cudaArray reformats it into such a layout.
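To make the locality argument concrete, here is a minimal sketch of one well-known space-filling curve, the Z-order (Morton) curve, next to ordinary row-major indexing. This is only an illustration: the real cudaArray layout is opaque and not documented, so treating it as Z-order is an assumption, and the names spread_bits, morton_index and linear_index are made up for this example.

```cpp
#include <cstdint>

// Spread the low 16 bits of v so that a zero bit sits between each original bit.
static std::uint32_t spread_bits(std::uint32_t v) {
    v &= 0x0000FFFFu;
    v = (v | (v << 8)) & 0x00FF00FFu;
    v = (v | (v << 4)) & 0x0F0F0F0Fu;
    v = (v | (v << 2)) & 0x33333333u;
    v = (v | (v << 1)) & 0x55555555u;
    return v;
}

// Z-order (Morton) index: bits of x go to even positions, bits of y to odd ones.
// Assumption: this stands in for "some space-filling curve", not for the actual
// (undocumented) cudaArray layout. Valid for x, y < 65536.
std::uint32_t morton_index(std::uint32_t x, std::uint32_t y) {
    return spread_bits(x) | (spread_bits(y) << 1);
}

// Plain row-major index, as used for a regular linear buffer.
std::uint32_t linear_index(std::uint32_t x, std::uint32_t y, std::uint32_t width) {
    return y * width + x;
}
```

With these helpers, the four texels of the aligned 2x2 block at (4,6), (5,6), (4,7), (5,7) get Morton indices 56, 57, 58 and 59, so they sit next to each other in memory. In a row-major buffer of width 1024, the two rows of that same block are 1024 elements apart, which is why vertical neighbours tend to land in different cache lines.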
So storing data in a cudaArray is an optimization that can yield better texture cache hit rates. On early CUDA architectures, a kernel cannot access a cudaArray directly; it can only read it through a texture. Devices of compute capability >= 2.0 can also read and write the array via CUDA surfaces.
Whether you should use a cudaArray or a regular buffer in global memory comes down to the intended usage and access patterns for the memory, so it will be project-specific.
cudaMallocArray() actually allocates a 2D array, so I think the issue is just inconsistent naming. Maybe it would have been more logical to call it cudaMallocArray2D().
I haven't used 3D textures. Hopefully, someone will answer and let us know why there's no need for cudaBindTexture3D().
You can use cudaBindTextureToArray(); it works for both 2D and 3D arrays.