thrust

How to calculate an average from an int2 array using Thrust

半世苍凉 posted on 2019-12-22 06:49:40
Question: I'm trying to calculate the average of an array that contains points (x, y). Is it possible to use Thrust to find the average point, represented as an (x, y) point? I could also represent the array as a thrust::device_vector<int> in which each cell contains the absolute position of the point, meaning i*numColumns + j, though I'm not sure that the average of those numbers represents the average cell. Thanks!
Answer 1:
#include <iostream>
#include <thrust/device_vector.h>
#include <thrust/reduce.h>
struct add
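
The answer above is cut off, so what follows is not its code. It is a minimal host-only sketch of the idea the question describes: sum the points component-wise with a binary functor, then divide by the count. `Point2`, `AddPoints`, and `average_point` are hypothetical names, and `Point2` stands in for CUDA's `int2` so the sketch compiles without the CUDA headers. With Thrust, the same shape would presumably be `thrust::reduce` over a `thrust::device_vector<int2>` with the functor marked `__host__ __device__`.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical stand-in for CUDA's int2 (assumption: same two int fields).
struct Point2 {
    int x;
    int y;
};

// Component-wise sum; this is the binary operation handed to the reduction.
struct AddPoints {
    Point2 operator()(const Point2& a, const Point2& b) const {
        return Point2{a.x + b.x, a.y + b.y};
    }
};

// Sums all points, then divides each component by the count.
// Integer division truncates; int sums can overflow for very large inputs.
Point2 average_point(const std::vector<Point2>& pts) {
    assert(!pts.empty());
    Point2 sum = std::accumulate(pts.begin(), pts.end(), Point2{0, 0}, AddPoints{});
    int n = static_cast<int>(pts.size());
    return Point2{sum.x / n, sum.y / n};
}
```

On the linearized alternative from the question: averaging i*numColumns + j does not in general give the average cell. With 4 columns, cells (0, 3) and (1, 0) have indices 3 and 4; the integer average 3 maps back to (0, 3), while the component-wise average is (0.5, 1.5).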

How fast is thrust::sort and what is the fastest radix sort implementation

牧云@^-^@ posted on 2019-12-21 22:11:18
Question: I'm a newbie to GPU programming. Recently, I have been trying to implement the GPU BVH construction algorithm based on a tutorial: http://devblogs.nvidia.com/parallelforall/thinking-parallel-part-iii-tree-construction-gpu/. In the first step of this algorithm, the Morton code (unsigned int) of every primitive is computed and sorted. The tutorial gives reference timings for computing and sorting the Morton codes of 12K objects: 0.02 ms, one thread per object: Calculate bounding box and assign

CUB (CUDA UnBound) equivalent of thrust::gather

别等时光非礼了梦想. posted on 2019-12-21 17:40:07
Question: Due to some performance issues with the Thrust libraries (see this page for more details), I am planning on refactoring a CUDA application to use CUB instead of Thrust, specifically to replace the thrust::sort_by_key and thrust::inclusive_scan calls. At a particular point in my application I need to sort 3 arrays by key. This is how I did this with Thrust:
thrust::sort_by_key(key_iter, key_iter + numKeys, indices);
thrust::gather_wrapper(indices, indices + numKeys, thrust::make_zip

Operating on thrust::complex types with thrust::transform

你说的曾经没有我的故事 posted on 2019-12-20 07:28:05
Question: I'm trying to use thrust::transform to operate on vectors of type thrust::complex<float>, without success. The following example blows up during compilation with several pages of errors.
#include <cuda.h>
#include <cuda_runtime.h>
#include <cufft.h>
#include <thrust/device_vector.h>
#include <thrust/host_vector.h>
#include <thrust/transform.h>
#include <thrust/complex.h>
int main(int argc, char *argv[]) {
    thrust::device_vector< thrust::complex<float> > d_vec1(4);
    thrust::device_vector<float> d

How to pass an array of vectors to cuda kernel?

走远了吗. posted on 2019-12-20 06:15:33
Question: I now have thrust::device_vector<int> A[N]; and my kernel function is __global__ void kernel(...) { auto a = A[threadIdx.x]; }. I know that I can pass a device_vector to a kernel via thrust::raw_pointer_cast. But how can I pass an array of vectors to it?
Answer 1: The really short answer is that you basically can't, and the longer answer is that you really shouldn't, even if you discover or are presented with a hacky way of doing this. And in the spirit of that advice, what you can do is something

Retain Duplicates with Set Intersection in CUDA

大兔子大兔子 posted on 2019-12-20 05:29:06
Question: I'm using CUDA and Thrust to perform paired set operations. I would like to retain duplicates, however. For example:
int keys[6] = {1, 1, 1, 3, 4, 5, 5};
int vals[6] = {1, 2, 3, 4, 5, 6, 7};
int comp[2] = {1, 5};
thrust::set_intersection_by_key(keys, keys + 6, comp, comp + 2, vals, rk, rv);
Desired result: rk[1, 1, 1, 5, 5] rv[1, 2, 3, 6, 7]
Actual result: rk[1, 5] rv[5, 7]
I want all of the vals where the corresponding key is contained in comp. Is there any way to achieve this using Thrust,
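
The goal stated in the question (keep every value whose key appears in comp) is a membership filter rather than a set intersection. The following is a host-only sketch of that filter, not code from the thread; `filter_by_key_membership` is a hypothetical name, and it assumes comp is sorted. It uses 7-element inputs, matching the seven values listed in the question. On the device, one possible mapping would be the vectorized `thrust::binary_search` to build a stencil of matches, followed by the stencil overload of `thrust::copy_if` on a zip of keys and values.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Keeps every (key, value) pair whose key occurs in the sorted list `comp`.
// Duplicated keys are all retained, unlike a multiset intersection.
std::pair<std::vector<int>, std::vector<int>>
filter_by_key_membership(const std::vector<int>& keys,
                         const std::vector<int>& vals,
                         const std::vector<int>& comp) {
    assert(keys.size() == vals.size());
    assert(std::is_sorted(comp.begin(), comp.end()));
    std::vector<int> rk;
    std::vector<int> rv;
    for (std::size_t i = 0; i < keys.size(); ++i) {
        // Membership test: O(log |comp|) per key.
        if (std::binary_search(comp.begin(), comp.end(), keys[i])) {
            rk.push_back(keys[i]);
            rv.push_back(vals[i]);
        }
    }
    return {rk, rv};
}
```

Called with keys {1, 1, 1, 3, 4, 5, 5}, vals {1, 2, 3, 4, 5, 6, 7}, and comp {1, 5}, this returns the keys and values listed as the desired result in the question.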

Thrust reduce not working with non equal input/output types

ぐ巨炮叔叔 posted on 2019-12-20 04:23:05
Question: I'm attempting to reduce the min and max of an array of values using Thrust, and I seem to be stuck. Given an array of floats, what I would like is to reduce their min and max values in one pass, but using Thrust's reduce method I instead get the mother (or at least auntie) of all template compile errors. My original code contains 5 lists of values spread over 2 float4 arrays that I want reduced, but I've boiled it down to this short example.
struct ReduceMinMax { __host__ __device__ float2
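
The question's code is truncated, so the cause of its compile errors cannot be read off here. One common way to reduce floats into a (min, max) pair is to first lift each input into the accumulator type and then combine accumulators only, so the binary operation takes and returns a single type. The host-only sketch below illustrates that split; `MinMax`, `lift`, `merge_min_max`, and `min_max` are hypothetical names, with `MinMax` standing in for `float2`. In Thrust, `thrust::transform_reduce` takes a unary lift and a binary combine of this kind, and `thrust::minmax_element` is an alternative when only the extrema of one range are needed.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for CUDA's float2: lo holds the minimum, hi the maximum.
struct MinMax {
    float lo;
    float hi;
};

// Unary step: lifts one input value into the accumulator type.
inline MinMax lift(float v) { return MinMax{v, v}; }

// Binary step: combines two partial results of the same type.
// It is associative and commutative, which a parallel reduction relies on.
inline MinMax merge_min_max(const MinMax& a, const MinMax& b) {
    return MinMax{std::min(a.lo, b.lo), std::max(a.hi, b.hi)};
}

// Single pass over the data: lift each element, then merge into the accumulator.
MinMax min_max(const std::vector<float>& values) {
    assert(!values.empty());
    MinMax acc = lift(values.front());
    for (float v : values) {
        acc = merge_min_max(acc, lift(v));
    }
    return acc;
}
```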

Thrust count occurrence [duplicate]

情到浓时终转凉″ posted on 2019-12-20 04:15:09
Question: This question already has answers here. Closed 7 years ago. Possible duplicate: Counting occurences of numbers in cuda array. Is there a way to use Thrust or CUDA to count the occurrences of duplicates in an array? For example, if I have a device vector { 11, 11, 9, 1, 3, 11, 1, 2, 9, 1, 11 }, I should get 1:3, 2:1, 3:1, 9:2, 11:4. If Thrust cannot do that, how can I use a kernel to do it? Thanks! I am doing a concentration calculation; that's why I am asking this question. Assume there are 100000
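
A usual approach to this kind of histogram is to sort the values so that equal elements become adjacent, then collapse each run into a (value, count) pair. The host-only sketch below shows that logic with the standard library; `count_occurrences` is a hypothetical name, and this is not code from the linked duplicate. With Thrust, the two steps would likely map to `thrust::sort` followed by `thrust::reduce_by_key`, using a `thrust::constant_iterator<int>(1)` as the values to sum.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Returns (value, count) pairs in ascending order of value.
// The input is taken by value because sorting reorders it.
std::vector<std::pair<int, int>> count_occurrences(std::vector<int> values) {
    // Step 1: sort so that duplicates sit next to each other.
    std::sort(values.begin(), values.end());

    // Step 2: run-length encode the sorted sequence.
    std::vector<std::pair<int, int>> counts;
    for (int v : values) {
        if (!counts.empty() && counts.back().first == v) {
            ++counts.back().second;   // same run: bump the count
        } else {
            counts.emplace_back(v, 1);  // new run starts here
        }
    }
    return counts;
}
```

For the vector in the question, this yields the pairs (1, 3), (2, 1), (3, 1), (9, 2), (11, 4).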

Thrust transform throws error: “bulk_kernel_by_value: an illegal memory access was encountered”

时间秒杀一切 posted on 2019-12-19 04:43:12
Question: I'm rather new to CUDA/Thrust and have a problem with a code snippet. To make it easier, I have trimmed it down to the bare minimum. The code is the following:
struct functor {
    functor(float (*g)(const float&)) : _g{g} {}
    __host__ __device__ float operator()(const float& x) const { return _g(x); }
private:
    float (*_g)(const float&);
};
__host__ __device__ float g(const float& x) { return 3*x; }
int main(void) {
    thrust::device_vector<float> X(4,1);
    thrust::transform(X.begin(), X.end(), X.begin(),

How to pass a vector to the constructor of a thrust-based odeint observer, such that it can be read within the functor

落花浮王杯 posted on 2019-12-19 03:22:17
Question: I am extending the parameter study example from Boost's odeint used with Thrust, and I do not know how to pass a vector of values to the constructor of the observer such that those values can be accessed (read-only) from within the observer's functor. The following is the code just for the observer.
//// Observes the system, comparing the current state to
//// values in unchangingVector
struct minimum_perturbation_observer {
    struct minPerturbFunctor {
        template< class T >
        __host__ __device__