thrust

Making the number of key occurrences equal using CUDA / Thrust

空扰寡人 submitted on 2019-12-02 13:01:59
Is there an efficient way to take a sorted key/value array pair and ensure that each key has an equal number of elements using the CUDA Thrust library? For instance, assume we have the following pair of arrays:
    ID: 1 2 2 3 3 3
    VN: 6 7 8 5 7 8
If we want each key to appear exactly twice, this would be the result:
    ID: 2 2 3 3
    VN: 7 8 5 7
The actual arrays will be much larger, containing millions of elements or more. I can do this easily with nested for-loops, but I would like to know whether there is a more efficient way to convert the arrays using a GPU. Thrust seems as though ...
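To make the requested transformation concrete, here is a minimal host-side C++ sketch of the logic, not a GPU implementation. The helper name `equalize_keys` is hypothetical, and the sketch assumes that keys with fewer than k occurrences are dropped and that the first k entries of every other key are kept, which matches the example in the question. On the GPU, the per-key counts could plausibly come from `thrust::reduce_by_key` and the filtering from `thrust::copy_if`, but that mapping is an assumption and is not taken from the original thread.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not part of Thrust): keep exactly k entries per key and
// drop every key that occurs fewer than k times. Keys must already be sorted.
std::pair<std::vector<int>, std::vector<int>>
equalize_keys(const std::vector<int>& keys, const std::vector<int>& vals, std::size_t k)
{
    assert(keys.size() == vals.size());
    std::vector<int> out_keys, out_vals;
    std::size_t begin = 0;
    while (begin < keys.size()) {
        // Find the end of the run of equal keys starting at `begin`.
        std::size_t end = begin;
        while (end < keys.size() && keys[end] == keys[begin]) ++end;
        // Keep the first k entries of the run only if the run is long enough.
        if (end - begin >= k) {
            for (std::size_t i = begin; i < begin + k; ++i) {
                out_keys.push_back(keys[i]);
                out_vals.push_back(vals[i]);
            }
        }
        begin = end;
    }
    return {out_keys, out_vals};
}
```

Calling `equalize_keys` with the ID and VN arrays from the question and k = 2 gives back the ID and VN arrays shown as the desired result.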

Bad GPU performance when compiling with -G parameter with nvcc compiler

白昼怎懂夜的黑 submitted on 2019-12-02 10:26:15
I am running some tests and noticed that compiling with the -G parameter gives me worse performance than compiling without it. I have checked the NVIDIA documentation:
    --device-debug (-G) Generate debug information for device code.
But that does not explain why performance is so bad. Where and when is this debug information generated, and what could be the cause of the slowdown?
Using the -G switch disables most compiler optimizations that nvcc might do in device code. The resulting code will often run slower than code that is not compiled with -G ...

How to pass an array of vectors to cuda kernel?

时光毁灭记忆、已成空白 submitted on 2019-12-02 09:10:58
I now have
    thrust::device_vector<int> A[N];
and my kernel function
    __global__ void kernel(...) { auto a = A[threadIdx.x]; }
I know that via thrust::raw_pointer_cast I could pass a device_vector to a kernel. But how could I pass an array of vectors to it?
talonmies
The really short answer is that you basically can't, and the longer answer is that you really shouldn't, even if you discover or are presented with a hacky way of doing this. And in the spirit of that advice, what you can do is something like this:
    thrust::device_vector<int> A(N);
    thrust::device_vector<int> B(N);
    thrust::device_vector ...

Thrust copy - OutputIterator column-major order

怎甘沉沦 submitted on 2019-12-02 08:50:34
I have a vector of matrices (stored as column-major arrays) that I want to concatenate vertically. To do so, I want to use the copy function from the Thrust framework, as in the following example snippet:
    int offset = 0;
    for (int i = 0; i < matrices.size(); ++i) {
        thrust::copy(
            thrust::device_ptr<float>(matrices[i]),
            thrust::device_ptr<float>(matrices[i]) + rows[i] * cols[i],
            thrust::device_ptr<float>(result) + offset
        );
        offset += rows[i] * cols[i];
    }
EDIT: extended example: The problem is that if I have a matrix A = [[1, 2, 3], [4, 5, 6]] (2 rows, 3 cols; in memory [1, 4, 2, 5, 3, 6]) and ...
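The excerpt is cut off, so the following is only a sketch under the assumption that the goal is to stack matrices with the same number of columns on top of each other while keeping column-major layout. It is plain host-side C++ rather than Thrust code, and `vconcat_col_major` is a hypothetical helper name. The point it illustrates is the index mapping: element (r, c) of matrix m lands at `c * total_rows + row_offset + r` in the result, so one contiguous copy per matrix cannot produce the stacked layout.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: vertically stack column-major matrices that all have
// `cols` columns. rows[m] is the number of rows of mats[m].
std::vector<float> vconcat_col_major(const std::vector<std::vector<float>>& mats,
                                     const std::vector<std::size_t>& rows,
                                     std::size_t cols)
{
    assert(mats.size() == rows.size());
    std::size_t total_rows = 0;
    for (std::size_t r : rows) total_rows += r;

    std::vector<float> result(total_rows * cols);
    std::size_t row_offset = 0;  // first result row occupied by the current matrix
    for (std::size_t m = 0; m < mats.size(); ++m) {
        assert(mats[m].size() == rows[m] * cols);
        for (std::size_t c = 0; c < cols; ++c)
            for (std::size_t r = 0; r < rows[m]; ++r)
                // Column c of the source is contiguous; it goes into column c
                // of the result, shifted down by row_offset.
                result[c * total_rows + row_offset + r] = mats[m][c * rows[m] + r];
        row_offset += rows[m];
    }
    return result;
}
```

Because each source column stays contiguous in the result, a device version could presumably issue one `thrust::copy` per column per matrix instead of one per matrix; that is an untested suggestion rather than something stated in the thread.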

VS program crashes in debug but not release mode?

寵の児 submitted on 2019-12-02 08:06:58
I am running the following program in VS 2012 to try out the Thrust function find:
    #include "cuda_runtime.h"
    #include "device_launch_parameters.h"
    #include <thrust/find.h>
    #include <thrust/device_vector.h>
    #include <stdio.h>
    #include <iostream>  // needed for std::cout and std::endl
    int main() {
        thrust::device_vector<char> input(4);
        input[0] = 'a';
        input[1] = 'b';
        input[2] = 'c';
        input[3] = 'd';
        thrust::device_vector<char>::iterator iter;
        iter = thrust::find(input.begin(), input.end(), 'a');
        std::cout << "Index of a = " << iter - input.begin() << std::endl;
        return 0;
    }
This is a modified version of a code example taken from http://docs.thrust ...

Function object not working properly

*爱你&永不变心* submitted on 2019-12-02 07:45:40
I have defined the following function object:
    struct Predicate1 {
        __device__ bool operator()(const DereferencedIteratorTuple& lhs, const DereferencedIteratorTuple& rhs) {
            using thrust::get;
            // if you do <=, returns last occurrence of largest element. < returns first
            if (get<0>(lhs) == get<2>(lhs) && get<0>(lhs) != 3)
                return get<1>(lhs) < get<1>(rhs);
            else
                return true;
        }
    };
where DereferencedIteratorTuple is defined as follows:
    typedef thrust::tuple<int, float, int> DereferencedIteratorTuple;
Moreover, I call it as follows:
    result = thrust::max_element(iter_begin, iter_end, Predicate1());
But the ...

Retain Duplicates with Set Intersection in CUDA

假如想象 submitted on 2019-12-02 07:08:37
I'm using CUDA and Thrust to perform paired set operations. I would like to retain duplicates, however. For example:
    int keys[7] = {1, 1, 1, 3, 4, 5, 5};
    int vals[7] = {1, 2, 3, 4, 5, 6, 7};
    int comp[2] = {1, 5};
    thrust::set_intersection_by_key(keys, keys + 7, comp, comp + 2, vals, rk, rv);
Desired result:
    rk[1, 1, 1, 5, 5]
    rv[1, 2, 3, 6, 7]
Actual result:
    rk[1, 5]
    rv[5, 7]
I want all of the vals whose corresponding key is contained in comp. Is there any way to achieve this using Thrust, or do I have to write my own kernel or Thrust function? I'm using this function: set_intersection_by ...
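What the question asks for is closer to a membership filter than to a set intersection: keep every key/value pair whose key occurs in comp. The following host-side C++ sketch shows that logic only; `filter_by_key_set` is a hypothetical helper, and it assumes comp is sorted so that a binary search can be used. On the device, a similar result might be obtained by computing membership flags with the vectorized `thrust::binary_search` and then compacting with `thrust::copy_if`, but that is an assumption, not the answer from the original thread.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: keep every (key, value) pair whose key appears in
// `comp`, duplicates included. `comp` must be sorted.
std::pair<std::vector<int>, std::vector<int>>
filter_by_key_set(const std::vector<int>& keys,
                  const std::vector<int>& vals,
                  const std::vector<int>& comp)
{
    assert(keys.size() == vals.size());
    assert(std::is_sorted(comp.begin(), comp.end()));
    std::vector<int> rk, rv;
    for (std::size_t i = 0; i < keys.size(); ++i) {
        // Membership test for a single key; each test is independent of the
        // others, which is what makes the approach easy to parallelize.
        if (std::binary_search(comp.begin(), comp.end(), keys[i])) {
            rk.push_back(keys[i]);
            rv.push_back(vals[i]);
        }
    }
    return {rk, rv};
}
```

With the keys, vals and comp arrays from the question, this returns the rk and rv arrays listed as the desired result.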

Thrust count occurrence [duplicate]

送分小仙女□ submitted on 2019-12-02 06:35:58
Possible Duplicate: Counting occurrences of numbers in cuda array
Is there a way to use Thrust or CUDA to count the occurrences of duplicates in an array? For example, if I have a device vector { 11, 11, 9, 1, 3, 11, 1, 2, 9, 1, 11 }, I should get 1:3, 2:1, 3:1, 9:2, 11:4. If Thrust cannot do that, how can I use a kernel to do it? Thanks!
I am doing a concentration calculation, which is why I am asking this question. Assume there are 100000 particles in a domain that has nx × ny × nz cells; I need to calculate the concentration of each cell (how many particles are in each cell). My kernel is this:
    __global ...
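A common way to count occurrences is to sort the data and then collapse each run of equal values into a (value, count) pair. The sketch below shows that idea in host-side C++; `count_occurrences` is a hypothetical helper name. In Thrust, the same two steps would typically be `thrust::sort` followed by `thrust::reduce_by_key` over a sequence of ones, though that mapping is offered here as an assumption and is not quoted from the thread.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: return (value, count) pairs in ascending value order.
std::vector<std::pair<int, int>> count_occurrences(std::vector<int> data)
{
    // Step 1: sort so that equal values become adjacent.
    std::sort(data.begin(), data.end());

    // Step 2: run-length encode the sorted sequence.
    std::vector<std::pair<int, int>> counts;
    for (int x : data) {
        if (!counts.empty() && counts.back().first == x)
            ++counts.back().second;
        else
            counts.push_back({x, 1});
    }

    // Sanity check: the counts must add up to the input size.
    std::size_t total = 0;
    for (const auto& kv : counts) total += static_cast<std::size_t>(kv.second);
    assert(total == data.size());
    return counts;
}
```

For the vector in the question this yields the pairs 1:3, 2:1, 3:1, 9:2 and 11:4. For the particle use case, the value being counted would be each particle's cell index.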

Thrust reduce not working with non equal input/output types

故事扮演 submitted on 2019-12-02 05:06:38
I'm attempting to reduce the min and max of an array of values using Thrust and I seem to be stuck. Given an array of floats, what I would like is to reduce their min and max values in one pass, but using Thrust's reduce method I instead get the mother (or at least auntie) of all template compile errors. My original code contains 5 lists of values spread over 2 float4 arrays that I want reduced, but I've boiled it down to this short example.
    struct ReduceMinMax {
        __host__ __device__ float2 operator()(float lhs, float rhs) {
            return make_float2(Min(lhs, rhs), Max(lhs, rhs));
        }
    };
    int main(int ...
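A reduction combines partial results with each other, so its binary operator has to accept and return the same type; the operator in the question takes two floats and returns a float2, which cannot be chained. One way around this is to first turn each input into a (min, max) pair and then combine pairs. The host-side C++ sketch below shows that structure; `MinMax`, `ToMinMax`, `CombineMinMax` and `reduce_min_max` are hypothetical names. In Thrust, this shape corresponds to `thrust::transform_reduce` with a unary functor and a binary functor, which is suggested here as an assumption and not quoted from the thread.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical accumulator type; plays the role of float2 in the question.
struct MinMax { float lo; float hi; };

// Unary step: lift a single value into a (min, max) pair.
struct ToMinMax {
    MinMax operator()(float x) const { return {x, x}; }
};

// Binary step: combine two partial results. Input and output types match,
// which is what a reduction requires.
struct CombineMinMax {
    MinMax operator()(const MinMax& a, const MinMax& b) const {
        return {std::min(a.lo, b.lo), std::max(a.hi, b.hi)};
    }
};

// Sequential stand-in for a transform-then-reduce over the whole array.
MinMax reduce_min_max(const std::vector<float>& data)
{
    assert(!data.empty());
    MinMax acc = ToMinMax()(data.front());
    for (std::size_t i = 1; i < data.size(); ++i)
        acc = CombineMinMax()(acc, ToMinMax()(data[i]));
    return acc;
}
```

For device code, the two functors would additionally need `__host__ __device__` qualifiers on their call operators.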

Thrust Sort by key on the fly or different approach?

大城市里の小女人 submitted on 2019-12-01 22:57:50
I was wondering whether it is possible to sort by keys using the Thrust library without creating a vector to store the keys (that is, generating them on the fly). For example, I have the following two vectors, keys and values:
    vectorKeys:   0, 1, 2, 0, 1, 2, 0, 1, 2
    VectorValues: 10, 20, 30, 40, 50, 60, 70, 80, 90
After sorting by keys:
    thrust::sort_by_key(vKeys.begin(), vKeys.end(), vValues.begin());
the resulting vectors are:
    vectorKeys:   0, 0, 0, 1, 1, 1, 2, 2, 2
    VectorValues: 10, 40, 70, 20, 50, 80, 30, 60, 90
What I would like to know is whether it is possible to sort_by_key without the need for the vKeys vector (on the fly), ...
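If the key of an element can be computed from its position, the key vector does not have to be stored: a comparator can derive the key whenever two elements are compared. The host-side C++ sketch below illustrates this under the assumption, suggested by the example but not stated in the excerpt, that the key of element i is `i % period`. `sort_by_implicit_key` is a hypothetical helper; it sorts an index permutation and then gathers the values. A Thrust version might sort indices with a custom comparator in a similar way, but that has not been verified against the original thread.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: reorder `values` as if sorted by the implicit key
// i % period, without ever materializing a key vector.
std::vector<int> sort_by_implicit_key(const std::vector<int>& values, std::size_t period)
{
    assert(period > 0);

    // Permutation 0, 1, 2, ... that will be sorted in place of the keys.
    std::vector<std::size_t> order(values.size());
    std::iota(order.begin(), order.end(), std::size_t{0});

    // The comparator computes both keys on the fly from the indices.
    // A stable sort keeps elements with equal keys in their original order.
    std::stable_sort(order.begin(), order.end(),
                     [period](std::size_t a, std::size_t b) {
                         return a % period < b % period;
                     });

    // Gather the values in sorted order.
    std::vector<int> sorted;
    sorted.reserve(values.size());
    for (std::size_t i : order) sorted.push_back(values[i]);
    return sorted;
}
```

With the values from the question and a period of 3, the result matches the sorted VectorValues shown above. The sketch still stores one index per element, so it trades the key vector for an index vector; it avoids computing and storing the keys, not the extra memory.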