thrust

proper thrust call for subtraction

泪湿孤枕 posted on 2019-12-14 02:46:49
Question: Following from here. Assuming that dev_X is a vector:
    int * X = (int*) malloc( ThreadsPerBlockX * BlocksPerGridX * sizeof(*X) );
    for ( int i = 0; i < ThreadsPerBlockX * BlocksPerGridX; i++ )
        X[ i ] = i;
    // create device vectors
    thrust::device_vector<int> dev_X ( ThreadsPerBlockX * BlocksPerGridX );
    // copy to device
    thrust::copy( X , X + theThreadsPerBlockX * theBlocksPerGridX , dev_X.begin() );
The following performs a subtraction:
    thrust::transform( dev_Kx.begin(), dev_Kx.end(), dev_X.begin
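The transform call in the question is cut off, so its remaining arguments are unknown. As a rough host-side sketch of element-wise subtraction, the standard library's binary `std::transform` takes its arguments in the same order as `thrust::transform` used with `thrust::minus<int>`. The function name `subtract` is hypothetical and the inputs are assumed to have equal length.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Hypothetical helper: returns a[i] - b[i] for every i.
// Assumes a.size() == b.size().
std::vector<int> subtract(const std::vector<int>& a, const std::vector<int>& b) {
    std::vector<int> out(a.size());
    // Argument order: first range, start of second range, output, binary op.
    std::transform(a.begin(), a.end(), b.begin(), out.begin(), std::minus<int>());
    return out;
}
```

On the device, the same shape of call would presumably use `thrust::device_vector<int>` ranges and `thrust::minus<int>()` in place of the standard-library pieces.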

CUDA Thrust and sort_by_key

人走茶凉 posted on 2019-12-13 12:29:08
Question: I'm looking for a sorting algorithm on CUDA that can sort an array A of elements (double) and return an array of keys B for that array A. I know the sort_by_key function in the Thrust library, but I want my array of elements A to remain unchanged. What can I do? My code is:
    void sortCUDA(double V[], int P[], int N) {
        real_t *Vcpy = (double*) malloc(N*sizeof(double));
        memcpy(Vcpy,V,N*sizeof(double));
        thrust::sort_by_key(V, V + N, P);
        free(Vcpy);
    }
I'm comparing the thrust algorithm against
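In the code as quoted, the copy `Vcpy` is allocated but the original `V` is what gets sorted, so `V` is modified. One common way to keep the input intact is to sort an index array by the values it points to. Below is a minimal host-side sketch of that idea using the standard library; the name `sorted_indices` is hypothetical. With Thrust, a comparable approach would be to fill `P` with `thrust::sequence` and call `thrust::sort_by_key` on the copy instead of on `V`.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Hypothetical helper: returns indices p such that v[p[0]] <= v[p[1]] <= ...
// The input v is taken by const reference and is never modified.
std::vector<int> sorted_indices(const std::vector<double>& v) {
    std::vector<int> p(v.size());
    std::iota(p.begin(), p.end(), 0);  // p = 0, 1, 2, ...
    std::stable_sort(p.begin(), p.end(),
                     [&v](int a, int b) { return v[a] < v[b]; });
    return p;
}
```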

Cuda thrust - xutility: name followed by “::” must be a class or namespace

元气小坏坏 posted on 2019-12-13 11:01:34
Question: I would like to use thrust reduction in my CUDA application. Hence I include the header and call the function:
    #include <thrust\reduce.h>
    __host__ void reduction(){
        unsigned int t = 0;
        thrust::reduce(t,t);
    }
However, I get compile errors (only one type): "name followed by "::" must be a class or namespace". The problem is with a file called xutility (which I haven't touched). All errors are related to the following class definition:
    // TEMPLATE CLASS iterator_traits
    template<class _Iter> struct
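One plausible reading, offered as an assumption because the excerpt is cut off, is that `thrust::reduce(t, t)` passes two `unsigned int` values where a range of iterators is expected, so the compiler ends up instantiating `iterator_traits` for a non-iterator type. The host-side sketch below shows the range-based shape of a reduction with `std::accumulate`; the name `sum_all` is hypothetical.

```cpp
#include <numeric>
#include <vector>

// Hypothetical helper: sums a range delimited by two iterators.
// thrust::reduce(first, last) expects the same kind of iterator pair.
unsigned int sum_all(const std::vector<unsigned int>& data) {
    return std::accumulate(data.begin(), data.end(), 0u);
}
```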

How to search the value from a std::map when I use cuda?

混江龙づ霸主 posted on 2019-12-13 09:34:00
Question: I have something stored in a std::map, which maps a string to a vector. Its keys and values look like:
    key      value
    "a"      [1,2,3]
    "b"      [8,100]
    "cde"    [7,10]
Each thread needs to process one query. A query looks like ["a", "b"] or ["cde", "a"]. So I need to get the values from the map and then do some other jobs, such as combining them. For the first query, the result will be [1,2,3,8,100]. The problem is, how can threads access the map and find the value by a key? At first, I tried to
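A `std::map` cannot be used from device code, so one common workaround is to flatten it on the host into contiguous arrays that can be copied to the GPU. The sketch below is an assumption about how that might look, not the approach taken in the original thread: the names `FlatMap` and `flatten` are hypothetical, and queries are assumed to be translated from string keys to integer key indices on the host before launch.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical flattened layout: key i owns values[offsets[i] .. offsets[i+1]).
struct FlatMap {
    std::vector<std::string> keys;  // stays on the host, in sorted map order
    std::vector<int> offsets;       // keys.size() + 1 entries
    std::vector<int> values;        // all value vectors concatenated
};

FlatMap flatten(const std::map<std::string, std::vector<int>>& m) {
    FlatMap f;
    f.offsets.push_back(0);
    for (const auto& kv : m) {
        f.keys.push_back(kv.first);
        f.values.insert(f.values.end(), kv.second.begin(), kv.second.end());
        f.offsets.push_back(static_cast<int>(f.values.size()));
    }
    return f;
}
```

Only `offsets` and `values` would need to live on the device; a thread handling key index `k` reads the slice between `offsets[k]` and `offsets[k + 1]`.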

Thrust - How to use my array/data - model

拈花ヽ惹草 posted on 2019-12-13 06:44:15
Question: I am new to Thrust (CUDA) and I want to do some array operations, but I can't find any similar example on the internet. I have the following two arrays (2D):
    a = { {1, 2, 3}, {4} }
    b = { {5}, {6, 7} }
I want Thrust to compute this array:
    c = { {1, 2, 3, 5}, {1, 2, 3, 6, 7}, {1, 2, 3, 5}, {1, 2, 3, 6, 7} }
I know how it works in C/C++ but not how to tell Thrust to do it. Here is my idea of how it might work: Thread 1: Take a[0] -> expand it with b. Write it to c. Thread 2: Take a[1] ->

Splicing two different length vectors based on their respective index vectors containing global addresses to new vectors of common length with thrust

Deadly posted on 2019-12-13 05:05:23
Question: This problem has been on my mind for several years. I have been learning a great deal of C++ and CUDA from this forum. Previously I wrote the following as serial Fortran code with a lot of conditional statements and gotos, because I could not find a clever way to do it. Here is the problem. Given 4 vectors:
    int indx(nshape);
    float dnx(nshape); /* nshape > nord */
    int indy(nord);
    float dny(nord);
indx and indy are index vectors (keys for values dnx, dny respectively) containing global

Simple Thrust code performs about half as fast as my naive cuda kernel. Am I using Thrust wrong?

天涯浪子 posted on 2019-12-13 03:51:34
Question: I'm pretty new to CUDA and Thrust, but my impression was that Thrust, when used well, is supposed to offer better performance than naively written CUDA kernels. Am I using Thrust in a sub-optimal way? Below is a complete, minimal example that takes an array u of length N+2, and for each i between 1 and N computes the average 0.5*(u[i-1] + u[i+1]) and puts the result in uNew[i]. (uNew[0] is set to u[0] and uNew[N+1] is set to u[N+1] so that the boundary terms don't change.) The code performs
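The asker's own code is not part of this excerpt, so the following is only a host-side sketch of the stencil described above. It expresses the neighbor average as a single binary transform over two views of `u` shifted by two elements, which is the same shape a `thrust::transform` call over offset iterators would take. The name `average_neighbors` is hypothetical, and the input is assumed to hold at least two elements.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: uNew[i] = 0.5 * (u[i-1] + u[i+1]) for i = 1..N,
// where u has N + 2 elements. Boundary entries are copied unchanged.
// Assumes u.size() >= 2.
std::vector<double> average_neighbors(const std::vector<double>& u) {
    std::vector<double> uNew(u);  // copies boundaries (and everything else)
    // Pair u[0..N-1] with u[2..N+1], write results to uNew[1..N].
    std::transform(u.begin(), u.end() - 2, u.begin() + 2, uNew.begin() + 1,
                   [](double left, double right) { return 0.5 * (left + right); });
    return uNew;
}
```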

Copy specific elements of an array with CUDA Thrust permutation iterator

柔情痞子 posted on 2019-12-13 03:39:57
Question: I have an array of glm::vec3 with count * 3 elements. I have another array which contains int indices of the elements to copy. An example:
    thrust::device_vector<glm::vec3> vals(9);
    // vals contains 9 vec3, which represent 3 "items"
    // vals[0], vals[1], vals[2] are the first "item",
    // vals[3], vals[4], vals[5] are the second "item"...
    int idcs[] = {0, 2};
    // index 0 and 2 should be copied, i.e.
    // vals[0..2] and vals[6..8]
I tried to use permutation iterators, but I cannot get it to work. My
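The asker's attempt is cut off, so the sketch below only illustrates the index mapping that such a copy needs: output element `i` comes from source element `idcs[i / 3] * 3 + i % 3`. It is written as a plain host-side loop with a hypothetical name, `gather_items`, and a generic element type standing in for `glm::vec3`. In Thrust, the same mapping could presumably be supplied as a `transform_iterator` over a `counting_iterator` and used as the index range of a `permutation_iterator`.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: copies whole items of `stride` consecutive elements.
// idcs holds item indices (not element indices) and is assumed non-negative
// and in range.
template <typename T>
std::vector<T> gather_items(const std::vector<T>& vals,
                            const std::vector<int>& idcs,
                            std::size_t stride) {
    std::vector<T> out(idcs.size() * stride);
    for (std::size_t i = 0; i < out.size(); ++i) {
        const std::size_t item = static_cast<std::size_t>(idcs[i / stride]);
        out[i] = vals[item * stride + i % stride];
    }
    return out;
}
```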

How to reduce nonconsecutive segments of numbers in array with Thrust

喜你入骨 posted on 2019-12-13 03:06:03
Question: I have a 1D array "A" which is composed of many arrays "a" like this: [image not included] I'm implementing code to sum up nonconsecutive segments (sum up the numbers in the segments of the same color of each array "a" in "A") as follows: [image not included] Any ideas on how to do that efficiently with Thrust? Thank you very much. Note: The pictures represent only one array "a". The big array "A" contains many arrays "a".
Answer 1: In the general case, where the ordering of the data and grouping by segments is not known in advance, the

Using cuda thrust with arrays instead vectors to inclusive_scan

 ̄綄美尐妖づ posted on 2019-12-13 02:34:42
Question: I have code given by @m.s.:
    #include <thrust/device_vector.h>
    #include <thrust/scan.h>
    #include <thrust/iterator/transform_iterator.h>
    #include <thrust/iterator/counting_iterator.h>
    #include <iostream>
    struct omit_negative : public thrust::unary_function<int, int> {
        __host__ __device__ int operator()(int value) {
            if (value<0) { value = 0; }
            return value;
        }
    };
    int main() {
        int array[] = {2,1,-1,3,-1,2};
        const int array_size = sizeof(array)/sizeof(array[0]);
        thrust::device_vector<int> d_array
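The rest of the quoted program is missing, so this is only a host-side sketch of the operation it appears to set up: an inclusive scan in which negative inputs count as zero, applied to a raw array rather than a vector. The name `scan_omit_negative` is hypothetical. Thrust algorithms also accept pointers as iterators, although a raw device pointer would typically need to be wrapped, for example with `thrust::device_pointer_cast`, so that the call is dispatched to the device.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: inclusive prefix sum over data[0..n),
// treating negative entries as 0. Takes a raw pointer and a length.
std::vector<int> scan_omit_negative(const int* data, std::size_t n) {
    std::vector<int> out(n);
    // Step 1: clamp negatives to zero (the role of omit_negative above).
    std::transform(data, data + n, out.begin(),
                   [](int v) { return v < 0 ? 0 : v; });
    // Step 2: in-place inclusive scan.
    std::partial_sum(out.begin(), out.end(), out.begin());
    return out;
}
```

For the array in the question, `{2, 1, -1, 3, -1, 2}`, the clamped values are `{2, 1, 0, 3, 0, 2}` and the running sums are `{2, 3, 3, 6, 6, 8}`.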