std::vector to array in CUDA

Submitted by 杀马特。学长 韩版系。学妹 on 2019-12-09 01:39:39

Question


Is there a way to convert a 2D vector into an array to be able to use it in CUDA kernels?

It is declared as:

vector<vector<int>> information;

I want to cudaMalloc device memory and copy the data from host to device. What would be the best way to do it?

int *d_information;
cudaMalloc((void**)&d_information, sizeof(int)*size);
cudaMemcpy(d_information, information, sizeof(int)*size, cudaMemcpyHostToDevice);

Answer 1:


In a word, no, there isn't. The CUDA API doesn't support deep copying, and it doesn't know anything about std::vector. If you insist on having a vector of vectors as the host source, you will need to do something like this:

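// size must be the total number of ints across all inner vectors
int *d_information;
cudaMalloc((void**)&d_information, sizeof(int)*size);

// Copy each inner vector to the next free position in the device buffer
int *dst = d_information;
for (std::vector<std::vector<int> >::iterator it = information.begin() ; it != information.end(); ++it) {
    size_t sz = it->size();
    if (sz == 0) continue;  // &((*it)[0]) is not valid for an empty vector
    int *src = &((*it)[0]);

    cudaMemcpy(dst, src, sizeof(int)*sz, cudaMemcpyHostToDevice);
    dst += sz;
}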

[disclaimer: written in browser, not compiled or tested. Use at own risk]

This would copy the host memory to a single allocation in GPU linear memory, requiring one copy for each inner vector. If the vector of vectors is a "jagged" array (the inner vectors have different lengths), you will also want to store an index somewhere for the GPU to use, for example the starting offset of each inner vector.
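One way to build such an index is sketched below. It is a minimal illustration, not part of the original answer: the helper names row_offsets and flat_index are hypothetical, and the layout assumed is that row r occupies the half-open range [offsets[r], offsets[r + 1]) of the flattened buffer. The offsets array would itself have to be copied to the device before a kernel could use it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the original answer): returns offsets such
// that row r occupies [offsets[r], offsets[r + 1]) in the flattened buffer.
std::vector<std::size_t> row_offsets(const std::vector<std::vector<int> >& rows) {
    std::vector<std::size_t> offsets(rows.size() + 1, 0);
    for (std::size_t r = 0; r < rows.size(); ++r) {
        offsets[r + 1] = offsets[r] + rows[r].size();
    }
    return offsets;
}

// Position of element j of row r in the flattened buffer. A kernel would use
// the same arithmetic on a device copy of the offsets array.
std::size_t flat_index(const std::vector<std::size_t>& offsets,
                       std::size_t r, std::size_t j) {
    assert(r + 1 < offsets.size());           // row exists
    assert(offsets[r] + j < offsets[r + 1]);  // element exists in that row
    return offsets[r] + j;
}
```

With this layout, empty rows simply produce two equal consecutive offsets, and the last entry of offsets is the total element count, which is the value the cudaMalloc call needs.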




Answer 2:


As far as I understand, the inner vectors of a vector of vectors do not need to reside in one contiguous block of memory. Each inner vector is contiguous on its own, but the data as a whole can be fragmented.

Depending on the amount of memory you need to transfer, I would do one of two things:

  1. Flatten your data into a single vector, and then use a single cudaMemcpy.
  2. Issue a series of cudaMemcpyAsync calls, where each copy handles a single vector in your vector of vectors, and then synchronize.
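The first option could look roughly like the sketch below. The helper names flatten and copy_to_device are hypothetical and not from the original thread. The device part is guarded by __CUDACC__ so the host-side flattening can also be built and checked with an ordinary C++ compiler; it is a sketch under those assumptions, not a tuned implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>
#ifdef __CUDACC__
#include <cuda_runtime.h>
#endif

// Hypothetical helper: concatenates all rows into one contiguous host buffer.
std::vector<int> flatten(const std::vector<std::vector<int> >& rows) {
    std::size_t total = 0;
    for (std::size_t r = 0; r < rows.size(); ++r) total += rows[r].size();

    std::vector<int> flat;
    flat.reserve(total);  // one allocation instead of repeated growth
    for (std::size_t r = 0; r < rows.size(); ++r) {
        flat.insert(flat.end(), rows[r].begin(), rows[r].end());
    }
    assert(flat.size() == total);
    return flat;
}

#ifdef __CUDACC__
// Only compiled by nvcc. Allocates device memory and copies the flat buffer
// with a single cudaMemcpy. Returns the error code of the first failing call.
// The caller owns *d_out and is responsible for calling cudaFree on it.
cudaError_t copy_to_device(const std::vector<int>& flat, int** d_out) {
    *d_out = NULL;
    if (flat.empty()) return cudaSuccess;  // nothing to allocate or copy
    cudaError_t err = cudaMalloc((void**)d_out, flat.size() * sizeof(int));
    if (err != cudaSuccess) return err;
    return cudaMemcpy(*d_out, &flat[0], flat.size() * sizeof(int),
                      cudaMemcpyHostToDevice);
}
#endif
```

Flattening costs one extra host-side copy of the data, but it replaces many small transfers with a single large one. For the second option, note that cudaMemcpyAsync is only asynchronous with respect to the host when the source is page-locked (pinned) host memory, which the storage of a default std::vector is not.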


Source: https://stackoverflow.com/questions/17570399/stdvector-to-array-in-cuda
