Thrust reduction result on device memory

Posted by 余生长醉 on 2019-12-18 06:56:16

Question


Is it possible to leave the return value of a thrust::reduce operation in device-allocated memory? In case it is, is it just as easy as assigning the value to a cudaMalloc'ed area, or should I use a thrust::device_ptr?


Answer 1:


Is it possible to leave the return value of a thrust::reduce operation in device-allocated memory?

The short answer is no.

thrust::reduce returns a value, the result of the reduction, and that value must be stored in a host-resident variable. The call is synchronous and always returns its result to the CPU, as its signature shows:

template<typename Iterator, typename T> 
T reduce(Iterator first, Iterator last, T init); 

Once the result of the operation has been returned to the CPU, you can copy it to the GPU if you like:

#include <iostream>
#include <thrust/device_vector.h>
#include <thrust/reduce.h>

int main(){

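    // 256 ones in device memory, so the expected sum is 256
    thrust::device_vector<int> data(256, 1);
    // One-element device vector that will hold the sum
    thrust::device_vector<int> result(1);
    // reduce returns the sum to the host; the assignment copies it back to the device
    result[0] = thrust::reduce(data.begin(), data.end());
    // Reading result[0] copies the value from the device to the host for printing
    std::cout << "result = " << result[0] << std::endl;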
    return 0;
}

Another alternative is thrust::reduce_by_key, which writes its reduction results to device memory rather than returning them to the host. If you use a single key for the entire array, the output is a single value, equivalent to the result of thrust::reduce.




Answer 2:


Yes, it should be possible: use thrust::reduce_by_key instead, with a thrust::constant_iterator supplying the keys.



Source: https://stackoverflow.com/questions/21761412/thrust-reduction-result-on-device-memory
