thrust | 易学教程

proper thrust call for subtraction

Question: Following from here. Assuming that dev_X is a vector. int * X = (int*) malloc( ThreadsPerBlockX * BlocksPerGridX * sizeof(*X) ); for ( int i = 0; i < ThreadsPerBlockX * BlocksPerGridX; i++ ) X[ i ] = i; // create device vectors thrust::device_vector<int> dev_X ( ThreadsPerBlockX * BlocksPerGridX ); //copy to device thrust::copy( X , X + theThreadsPerBlockX * theBlocksPerGridX , dev_X.begin() ); The following is making a subtraction: thrust::transform( dev_Kx.begin(), dev_Kx.end(), dev_X.begin
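For orientation, the element-wise subtraction being asked about can be sketched on the host with the standard library. This is not the asker's code: the excerpt is cut off before the call is complete, so the operand order (Kx minus X), the separate output vector, and the function name `subtract` are all assumptions. With Thrust, the analogous call would be `thrust::transform` with `thrust::minus<int>()` over device iterators.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical helper (not from the question): returns Kx[i] - X[i] for every i.
// The operand order is an assumption, since the original call is truncated.
std::vector<int> subtract(const std::vector<int>& Kx, const std::vector<int>& X) {
    assert(Kx.size() == X.size());  // both ranges must have the same length
    std::vector<int> out(Kx.size());
    // Binary transform: applies std::minus to matching elements of the two ranges.
    std::transform(Kx.begin(), Kx.end(), X.begin(), out.begin(), std::minus<int>());
    return out;
}
```

The first range supplies the left operand and the second range the right operand, so swapping the two ranges flips the sign of every result.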

CUDA Thrust and sort_by_key

Question: I’m looking for a sorting algorithm on CUDA that can sort an array A of elements (double) and return an array of keys B for that array A. I know the sort_by_key function in the Thrust library but I want my array of elements A to remain unchanged. What can I do? My code is: void sortCUDA(double V[], int P[], int N) { real_t *Vcpy = (double*) malloc(N*sizeof(double)); memcpy(Vcpy,V,N*sizeof(double)); thrust::sort_by_key(V, V + N, P); free(Vcpy); } I'm comparing the thrust algorithm against
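One way to read the requirement (keys out, values untouched) is as an "argsort". Below is a minimal host-side sketch in standard C++, not the asker's code and not a Thrust call. The function name `sort_keys_only` is hypothetical, and it assumes P should end up holding the indices of V in ascending order of value. In Thrust, a comparable effect could be obtained by filling P with `thrust::sequence` and calling `thrust::sort_by_key` on the copy of V rather than on V itself.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>

// Hypothetical helper: fills P with the permutation that sorts V ascending.
// V is taken as const, so the original values cannot be modified.
void sort_keys_only(const double* V, int* P, int N) {
    assert(N >= 0 && (N == 0 || (V != nullptr && P != nullptr)));
    std::iota(P, P + N, 0);  // P = 0, 1, ..., N-1
    // Sort the indices by the values they point at; equal values keep their order.
    std::stable_sort(P, P + N, [V](int a, int b) { return V[a] < V[b]; });
}
```

After the call, `V[P[0]] <= V[P[1]] <= ...` holds while V keeps its original order.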

Cuda thrust - xutility: name followed by “::” must be a class or namespace

Question: I would like to use thrust reduction in my CUDA application. Hence I include the header and call the function: #include <thrust\reduce.h> __host__ void reduction(){ unsigned int t = 0; thrust::reduce(t,t); } However I get compile errors (only one type): "name followed by "::" must be a class or namespace". The problem is with a file called xutility (which I haven't touched). All errors are related to the following class definition: // TEMPLATE CLASS iterator_traits template<class _Iter> struct

How to search the value from a std::map when I use cuda?

Question: I have something stored in std::map, which maps string to vector. Its keys and values look like: key value "a"-----[1,2,3] "b"-----[8,100] "cde"----[7,10] Each thread needs to process one query. A query looks like ["a", "b"] or ["cde", "a"]. So I need to get the values from the map and then do some other work, such as combining them. For the first query, the result will be [1,2,3,8,100]. The problem is, how can threads access the map and find the value by a key? At first, I tried to

Thrust - How to use my array/data - model

Question: I am new to thrust (cuda) and I want to do some array operations but I can't find any similar example on the internet. I have the following two arrays (2d): a = { {1, 2, 3}, {4} } b = { {5}, {6, 7} } I want thrust to compute this array: c = { {1, 2, 3, 5}, {1, 2, 3, 6, 7}, {1, 2, 3, 5}, {1, 2, 3, 6, 7} } I know how it works in c/c++ but not how to tell thrust to do it. Here is my idea of how it might work: Thread 1: Take a[0] -> expand it with b. Write it to c. Thread 2: Take a[1] ->

Splicing two different length vectors based on their respective index vectors containing global addresses to new vectors of common length with thrust

Question: This problem has been on my mind for several years. I have been learning a great deal of C++ and CUDA from this forum. Previously I wrote the following in serial Fortran code with a lot of conditional statements, and using gotos because I could not find a clever way to do it. Here is the problem. Given 4 vectors: int indx(nshape); float dnx(nshape); /* nshape > nord */ int indy(nord); float dny(nord); indx and indy are index vectors (keys for values dnx, dny respectively) containing global

Simple Thrust code performs about half as fast as my naive cuda kernel. Am I using Thrust wrong?

Question: I'm pretty new to Cuda and Thrust, but my impression was that Thrust, when used well, is supposed to offer better performance than naively written Cuda kernels. Am I using Thrust in a sub-optimal way? Below is a complete, minimal example that takes an array u of length N+2 , and for each i between 1 and N computes the average 0.5*(u[i-1] + u[i+1]) and puts the result in uNew[i] . ( uNew[0] is set to u[0] and uNew[N+1] is set to u[N+1] so that the boundary terms don't change). The code performs
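The computation described above can be written down as a plain host-side reference in standard C++, which is handy for checking either the kernel or the Thrust version. This is a sketch rather than the asker's benchmark code; the function name `average_neighbours` is hypothetical. In Thrust, the same stencil is commonly expressed as a binary `thrust::transform` over the two shifted ranges starting at `u.begin()` and `u.begin() + 2`, writing from `uNew.begin() + 1`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reference implementation: u has length N+2.
// Interior points become the average of their two neighbours;
// the two boundary entries are copied through unchanged.
std::vector<double> average_neighbours(const std::vector<double>& u) {
    assert(u.size() >= 2);        // need both boundary entries
    std::vector<double> uNew(u);  // copying also handles uNew[0] and uNew[N+1]
    for (std::size_t i = 1; i + 1 < u.size(); ++i) {
        uNew[i] = 0.5 * (u[i - 1] + u[i + 1]);
    }
    return uNew;
}
```

Reading from `u` and writing to a separate `uNew` matters here: updating in place would feed already-averaged values into later points.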

Copy specific elements of an array with CUDA Thrust permutation iterator

Question: I have an array of glm::vec3 with count * 3 elements. I have another array which contains int indices of the elements to copy. An example: thrust::device_vector<glm::vec3> vals(9); // vals contains 9 vec3, which represent 3 "items" // vals[0], vals[1], vals[2] are the first "item", // vals[3], vals[4], vals[5] are the second "item"... int idcs[] = {0, 2}; // index 0 and 2 should be copied, i.e. // vals[0..2] and vals[6..8] I tried to use permutation iterators, but I cannot get it to work. My
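The index arithmetic in the example (item k occupies elements 3k to 3k+2) can be illustrated with a small host-side gather in standard C++. This is only a sketch of the mapping, not the permutation-iterator solution the asker is after; `gather_items` and its `item_size` parameter are hypothetical names, and a generic element type stands in for `glm::vec3` so the block compiles without GLM.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: copies whole "items" of item_size consecutive elements.
// Output position j reads from vals[idcs[j / item_size] * item_size + j % item_size].
template <typename T>
std::vector<T> gather_items(const std::vector<T>& vals,
                            const std::vector<int>& idcs,
                            std::size_t item_size = 3) {
    std::vector<T> out;
    out.reserve(idcs.size() * item_size);
    for (int item : idcs) {
        assert(item >= 0);
        const std::size_t first = static_cast<std::size_t>(item) * item_size;
        assert(first + item_size <= vals.size());  // item must lie inside vals
        for (std::size_t k = 0; k < item_size; ++k) {
            out.push_back(vals[first + k]);
        }
    }
    return out;
}
```

The formula in the comment is the part that carries over to the GPU: an index map of that form is what a permutation-style iterator would need to be fed.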

How to reduce nonconsecutive segments of numbers in array with Thrust

Question: I have a 1D array "A" which is composed of many arrays "a" like this: I'm implementing code to sum up non-consecutive segments (sum up the numbers in the segments of the same color of each array "a" in "A") as follows: Any ideas to do that efficiently with thrust? Thank you very much. Note: The pictures represent only one array "a". The big array "A" contains many arrays "a". Answer 1: In the general case, where the ordering of the data and grouping by segments is not known in advance, the

Using cuda thrust with arrays instead vectors to inclusive_scan

Question: I have code given by @m.s.: #include <thrust/device_vector.h> #include <thrust/scan.h> #include <thrust/iterator/transform_iterator.h> #include <thrust/iterator/counting_iterator.h> #include <iostream> struct omit_negative : public thrust::unary_function<int, int> { __host__ __device__ int operator()(int value) { if (value<0) { value = 0; } return value; } }; int main() { int array[] = {2,1,-1,3,-1,2}; const int array_size = sizeof(array)/sizeof(array[0]); thrust::device_vector<int> d_array
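The excerpt stops before the scan itself, so the following is a guess at the intent, based on the title and the included headers: an inclusive scan (running sum) over the input after `omit_negative` has replaced negative entries with zero. It is a host-side sketch in standard C++ over a plain vector, not the Thrust code from the question; `scan_omitting_negatives` is a hypothetical name. With Thrust, the usual pattern for this would be `thrust::inclusive_scan` over a `thrust::make_transform_iterator`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Same rule as the omit_negative functor in the question: negatives become 0.
int omit_negative(int value) { return value < 0 ? 0 : value; }

// Hypothetical helper: out[i] = sum of omit_negative(in[0..i]).
std::vector<int> scan_omitting_negatives(const std::vector<int>& in) {
    std::vector<int> out(in.size());
    int running = 0;
    for (std::size_t i = 0; i < in.size(); ++i) {
        running += omit_negative(in[i]);
        assert(running >= 0);  // clamped terms are never negative
        out[i] = running;
    }
    return out;
}
```

For the input from the question, {2,1,-1,3,-1,2}, the clamped sequence is {2,1,0,3,0,2} and the running sums are {2,3,3,6,6,8}.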