CUDA : How to copy a 3D array from host to device?

Posted by 坚强是说给别人听的谎言 on 2019-12-01 13:06:00

Question


I want to learn how I can copy a three-dimensional array from host memory to device memory. Let's say I have a 3D array that contains data, for example int host_data[256][256][256]; I want to copy that data to dev_data (a device array) so that host_data[x][y][z] == dev_data[x][y][z]. How can I do it, and how am I supposed to access the dev_data array on the device? A simple example would be very helpful.


Answer 1:


The common way is to flatten the array (make it one-dimensional). You then need a small calculation to map an (x, y, z) triple to a single number: a position in the flattened one-dimensional array.

Example 2D:

int data[256][256];
int *flattened = &data[0][0];  // a 2D array does not convert to int* implicitly
data[x][y] == flattened[x * 256 + y];

Example 3D:

int data[256][256][256];
int *flattened = &data[0][0][0];  // address of the first element
data[x][y][z] == flattened[x * 256 * 256 + y * 256 + z];

or use a wrapper:

__host__ __device__ inline int index(const int x, const int y, const int z) {
     return x * 256 * 256 + y * 256 + z;
}

Knowing that, you can allocate a linear array with cudaMalloc, as usual, and then use an index function to access the corresponding element in device code.
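A minimal sketch of that approach, assuming a 256 x 256 x 256 int array as in the question. The kernel add_one and the helper roundtrip_flat are hypothetical names chosen for illustration, and the host data lives in a std::vector because 64 MiB is too large for a stack array.

```cuda
#include <cstddef>
#include <vector>
#include <cuda_runtime.h>

constexpr int N = 256;  // edge length from the question

// Map (x, y, z) to an offset in the flattened array; z varies fastest.
__host__ __device__ inline int index(int x, int y, int z) {
    return (x * N + y) * N + z;
}

// Hypothetical kernel: one thread per element, adds 1 in place.
__global__ void add_one(int *data) {
    int z = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y;
    int x = blockIdx.z;
    if (z < N) data[index(x, y, z)] += 1;
}

// Round trip host -> device -> host.
// Returns the number of wrong elements, or -1 if a CUDA call failed.
int roundtrip_flat() {
    const size_t count = size_t(N) * N * N;
    const size_t bytes = count * sizeof(int);

    std::vector<int> host(count);  // heap storage for the flattened array
    for (int x = 0; x < N; ++x)
        for (int y = 0; y < N; ++y)
            for (int z = 0; z < N; ++z)
                host[index(x, y, z)] = x + y + z;

    int *dev = nullptr;
    if (cudaMalloc((void **)&dev, bytes) != cudaSuccess) return -1;

    // A single cudaMemcpy moves the whole array.
    bool ok = cudaMemcpy(dev, host.data(), bytes,
                         cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        dim3 block(256);
        // grid.x covers z, grid.y covers y, grid.z covers x
        dim3 grid((N + block.x - 1) / block.x, N, N);
        add_one<<<grid, block>>>(dev);
        ok = cudaGetLastError() == cudaSuccess &&
             cudaMemcpy(host.data(), dev, bytes,
                        cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(dev);
    if (!ok) return -1;

    int wrong = 0;
    for (int x = 0; x < N; ++x)
        for (int y = 0; y < N; ++y)
            for (int z = 0; z < N; ++z)
                if (host[index(x, y, z)] != x + y + z + 1) ++wrong;
    return wrong;
}
```

Calling roundtrip_flat() from main and checking for a return value of 0 confirms that host and device agree on the layout. Because index is marked __host__ __device__, the same mapping is used on both sides.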

Update: The author of this question claims to have found a better solution (at least for 2D); you might want to have a look.




Answer 2:


For fixed dimensions (e.g. [256][256][256]), let the compiler do the work for you and follow this example. This is attractive because we need only a single cudaMalloc/cudaMemcpy to transfer the data, using a single pointer. If you must have variable dimensions, it's better to think about alternative ways to handle this because of the complexity, but you may wish to look at this example (referring to the second example code that I posted). Please be advised that this method is considerably more complicated and harder to follow. I recommend not using it if you can avoid it.
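One way to let the compiler do the index arithmetic for fixed dimensions is a pointer-to-array type. The sketch below is an assumption about what such code can look like, not necessarily the example referred to above; Plane, scale3d and roundtrip_fixed are hypothetical names.

```cuda
#include <cstddef>
#include <cuda_runtime.h>

constexpr int NX = 256, NY = 256, NZ = 256;  // fixed at compile time

// One x-slice of the array. A pointer to Plane can be indexed as p[x][y][z].
typedef int Plane[NY][NZ];

// Hypothetical kernel: doubles every element using three subscripts.
__global__ void scale3d(Plane *data) {
    int z = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y;
    int x = blockIdx.z;
    if (z < NZ) data[x][y][z] *= 2;
}

// Returns the number of wrong elements, or -1 if a CUDA call failed.
int roundtrip_fixed() {
    const size_t bytes = sizeof(Plane) * NX;

    Plane *host = new Plane[NX];  // contiguous heap block, usable as host[x][y][z]
    for (int x = 0; x < NX; ++x)
        for (int y = 0; y < NY; ++y)
            for (int z = 0; z < NZ; ++z)
                host[x][y][z] = x + y + z;

    Plane *dev = nullptr;
    bool ok = cudaMalloc((void **)&dev, bytes) == cudaSuccess;
    if (ok)
        ok = cudaMemcpy(dev, host, bytes, cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        dim3 block(256);
        dim3 grid((NZ + block.x - 1) / block.x, NY, NX);
        scale3d<<<grid, block>>>(dev);
        ok = cudaGetLastError() == cudaSuccess &&
             cudaMemcpy(host, dev, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(dev);  // freeing a null pointer performs no operation

    int wrong = -1;
    if (ok) {
        wrong = 0;
        for (int x = 0; x < NX; ++x)
            for (int y = 0; y < NY; ++y)
                for (int z = 0; z < NZ; ++z)
                    if (host[x][y][z] != 2 * (x + y + z)) ++wrong;
    }
    delete[] host;
    return wrong;
}
```

The allocation and the copy each remain a single call, and both host and device code keep the data[x][y][z] syntax. The cost is that NY and NZ must be compile-time constants.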

Edit: If you're willing to flatten your array, the answer provided by @Ixanezis is recommended, and is commonly used. My answer is based on the assumption that you really want to access the array using 3 subscripts both on the host and device. As pointed out in the other answer, however, you can simulate 3 subscript access using a macro or function to calculate offsets into a 1-D array.
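For dimensions known only at run time, the offset calculation can be wrapped in a small function object instead of a macro. This is a sketch under assumptions: View3D, fill_view and fill_on_device are hypothetical names, and the view does not own the memory it points to.

```cuda
#include <cstddef>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical non-owning view over a flat buffer with run-time dimensions.
struct View3D {
    int *data;
    int ny, nz;
    // Simulates data[x][y][z]; z varies fastest.
    __host__ __device__ int &operator()(int x, int y, int z) const {
        return data[(size_t(x) * ny + y) * nz + z];
    }
};

// Hypothetical kernel: writes a value that encodes each element's coordinates.
__global__ void fill_view(View3D v, int nx) {
    int z = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y;
    int x = blockIdx.z;
    if (x < nx && y < v.ny && z < v.nz) v(x, y, z) = x * 10000 + y * 100 + z;
}

// Returns the number of wrong elements, or -1 if a CUDA call failed.
int fill_on_device(int nx, int ny, int nz) {
    const size_t count = size_t(nx) * ny * nz;
    std::vector<int> host(count, 0);

    int *dev = nullptr;
    if (cudaMalloc((void **)&dev, count * sizeof(int)) != cudaSuccess) return -1;

    dim3 block(32);
    dim3 grid((nz + block.x - 1) / block.x, ny, nx);
    fill_view<<<grid, block>>>(View3D{dev, ny, nz}, nx);  // view passed by value
    bool ok = cudaGetLastError() == cudaSuccess &&
              cudaMemcpy(host.data(), dev, count * sizeof(int),
                         cudaMemcpyDeviceToHost) == cudaSuccess;
    cudaFree(dev);
    if (!ok) return -1;

    View3D h{host.data(), ny, nz};  // same view type, now over host memory
    int wrong = 0;
    for (int x = 0; x < nx; ++x)
        for (int y = 0; y < ny; ++y)
            for (int z = 0; z < nz; ++z)
                if (h(x, y, z) != x * 10000 + y * 100 + z) ++wrong;
    return wrong;
}
```

For example, fill_on_device(3, 4, 5) exercises a non-cubic array. The storage stays one-dimensional, so a single cudaMalloc and cudaMemcpy still suffice.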



Source: https://stackoverflow.com/questions/15799086/cuda-how-to-copy-a-3d-array-from-host-to-device
