Multi GPU usage with CUDA Thrust

空扰寡人 submitted on 2019-12-05 22:41:59

The problem here is that you are trying to perform a device-to-device copy of data between a pair of device_vector objects which reside in different GPU contexts (because of the cudaSetDevice call). What you have perhaps overlooked is that this sequence of operations:

thrust::host_vector<float> hvConscience(1024);
vRes.push_back(hvConscience);

is performing a copy from hvConscience at each loop iteration. The Thrust backend expects the source and destination memory to lie in the same GPU context. In this case they do not, hence the error.
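To make the copy visible, here is a minimal host-only sketch. CountingVec is a hypothetical stand-in for a device vector (it is not part of Thrust and touches no GPU); it merely counts how often its copy constructor runs when elements are pushed by value into a std::vector.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for thrust::device_vector<float>: it only counts
// copy constructions and never allocates GPU memory.
struct CountingVec {
    static int copies;
    std::size_t size;
    explicit CountingVec(std::size_t n) : size(n) {}
    CountingVec(const CountingVec& other) : size(other.size) { ++copies; }
};
int CountingVec::copies = 0;

// Pushes `count` elements by value and returns how many element copies ran.
int copies_when_pushing_values(int count) {
    CountingVec::copies = 0;
    std::vector<CountingVec> by_value;
    by_value.reserve(count);      // rule out extra copies caused by regrowth
    for (int i = 0; i < count; ++i) {
        CountingVec v(1024);
        by_value.push_back(v);    // the copy constructor runs here
    }
    return CountingVec::copies;
}
```

Every push_back of an object copy-constructs a new element. With a real device_vector, that copy would be a device allocation plus a device-side copy issued under whichever GPU is current at that moment.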

What you probably want to do is work with a vector of pointers to device_vector instead, so something like:

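#include <vector>
#include <cuda_runtime.h>
#include <thrust/device_vector.h>

typedef thrust::device_vector< float > vec;
typedef vec *p_vec;
std::vector< p_vec > vRes;

// GetCudaDeviceCount() is assumed to be the asker's own helper, not a CUDA runtime call
unsigned int iDeviceCount   = GetCudaDeviceCount();
for(unsigned int i = 0; i < iDeviceCount; i++) {
    cudaSetDevice(i);                     // make device i current before allocating
    p_vec hvConscience = new vec(1024);   // vector is created once, on device i
    vRes.push_back(hvConscience);         // copies only the host pointer
}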

[disclaimer: code written in browser, neither compiled nor tested, use at own risk]

This way you are only creating each vector once, in the correct GPU context, and then copying only a host pointer into the std::vector, which doesn't trigger any device-side copies across memory spaces.
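One possible refinement, sketched here under assumptions, is to hold the pointers in std::unique_ptr so that each vector is released automatically instead of leaking from a bare new. The helper make_per_device and its select_device callback are hypothetical names introduced for illustration; the helper is templated on the vector type, so it can be exercised on the host without a GPU.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Builds one heap-allocated vector of `n` elements per device.
// `select_device(i)` is called before each allocation; with CUDA it would
// be expected to wrap cudaSetDevice(i).
template <typename Vec, typename SelectDevice>
std::vector<std::unique_ptr<Vec>> make_per_device(int device_count,
                                                  std::size_t n,
                                                  SelectDevice select_device) {
    std::vector<std::unique_ptr<Vec>> out;
    out.reserve(device_count);
    for (int i = 0; i < device_count; ++i) {
        select_device(i);                      // make device i current first
        out.push_back(std::make_unique<Vec>(n));  // moves a host pointer only
    }
    return out;
}
```

With Thrust the call would presumably look like make_per_device< thrust::device_vector<float> >(iDeviceCount, 1024, [](int i) { cudaSetDevice(i); }), although that exact call is untested here. The element type is never copied, only the owning pointers are moved.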
