Is it possible to leave the return value of a thrust::reduce operation in device-allocated memory? If so, is it as easy as assigning the value to a cudaMalloc'
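Yes, this should be possible. thrust::reduce returns its result by value to the host, so use thrust::reduce_by_key instead and supply a thrust::constant_iterator for the keys. With every key equal, the whole input is treated as a single segment, and the one reduced value is written through the values output iterator, which can point at device memory.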