thrust | 易学教程

Purpose and usage of counting_iterators in CUDA Thrust library

Question: I have trouble understanding counting_iterator in the Thrust library for CUDA. What is its purpose and how is it used? Is it also available in other programming languages such as C++? Answer 1: A counting iterator is just an iterator that returns the next value from a sequence, which is advanced each time the iterator is incremented. The simplest possible example is something like this: #include <iostream> #include <thrust/iterator/counting_iterator.h> int main(void) { int n = 10; thrust::counting

How to use make_transform_iterator() with counting_iterator<> and execution_policy in Thrust?

Question: I am trying to compile this code with MSVS2012, CUDA 5.5, Thrust 1.7: #include <iostream> #include <thrust/iterator/counting_iterator.h> #include <thrust/iterator/transform_iterator.h> #include <thrust/find.h> #include <thrust/execution_policy.h> struct is_odd { __host__ __device__ bool operator()(uint64_t &x) { return x & 1; } }; int main() { thrust::counting_iterator<uint64_t> first(0); thrust::counting_iterator<uint64_t> last = first + 100; auto iter = thrust::find(thrust::device, thrust::make

How does Thrust know how to automatically configure the kernels it launches?

Question: Thrust is able to hide a variety of details from the coder, and it is claimed that Thrust sets the parameters to some degree according to the system specifications. How does Thrust choose the best parameterization, and how does it handle a variety of codes on different machines? What is Thrust's approach to implementing such a generic library? Answer 1: Thrust uses a heuristic which attempts to maximize the potential occupancy of the CUDA kernels it launches. A standalone version of the heuristic

Applying reduction operation using Thrust subject to a boolean condition

Question: I want to use thrust::reduce to find the max value in an array A. However, A[i] should only be chosen as the max if it also satisfies a particular boolean condition in another array B; for example, B[i] should be true. Is there a version of thrust::reduce that does this? I looked at the documentation and found only the following API: thrust::reduce(begin, end, default value, operator). However, I was curious whether there is a version more suitable to my problem. EDIT: Compilation fails in the last line!
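
One way to approach the question above is to fold the condition into the reduction itself: map each pair (A[i], B[i]) to A[i] when B[i] is true and to a sentinel otherwise, then take the maximum of the mapped values. The sketch below shows that idea on the host with the standard `std::inner_product`; it is an illustration, not the answer from the original thread. The helper name `masked_max` and the choice of the lowest `int` as the sentinel are assumptions made for this example. Thrust offers `thrust::inner_product` with the same two-operator overload shape, so the same pattern should carry over to device vectors.

```cpp
#include <algorithm>
#include <limits>
#include <numeric>
#include <vector>

// Hypothetical helper: maximum of a[i] over the indices where b[i] is true.
// Assumes b has at least as many elements as a.
// Returns std::numeric_limits<int>::lowest() when no element qualifies.
int masked_max(const std::vector<int>& a, const std::vector<bool>& b) {
    const int sentinel = std::numeric_limits<int>::lowest();
    // inner_product computes acc = combine(acc, mask(a[i], b[i])) for each i.
    return std::inner_product(
        a.begin(), a.end(), b.begin(), sentinel,
        // "Sum" step replaced by max: keeps the largest mapped value so far.
        [](int acc, int mapped) { return std::max(acc, mapped); },
        // "Product" step replaced by masking: drop values whose flag is false.
        [sentinel](int value, bool keep) { return keep ? value : sentinel; });
}
```

For example, with A = {3, 9, 4, 7} and B = {true, false, true, true}, the helper skips 9 because its flag is false and yields 7. Using a sentinel keeps the reduction a plain max, at the cost of reserving one value to mean "no qualifying element".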

how to free device_vector<int>

Question: I allocated some space using a Thrust device vector as follows: thrust::device_vector<int> s(10000000000); How do I free this space explicitly and correctly? Answer 1: device_vector deallocates its associated storage when it goes out of scope, just like any standard C++ container. If you'd like to deallocate any Thrust vector's storage manually during its lifetime, you can do so using the following recipe: // empty the vector vec.clear(); // deallocate any capacity which may currently be

cuda/thrust: Trying to sort_by_key 2.8GB of data in 6GB of GPU RAM throws bad_alloc

Question: I have just started using Thrust, and one of the biggest issues I have so far is that there seems to be no documentation on how much memory operations require. So I am not sure why the code below is throwing bad_alloc when trying to sort (before the sorting I still have >50% of GPU memory available, and I have 70GB of RAM available on the CPU). Can anyone shed some light on this? #include <thrust/device_vector.h> #include <thrust/sort.h> #include <thrust/random.h> void initialize_data

Sorting objects with Thrust CUDA

Question: Is it possible to sort objects using the Thrust library? I have the following struct: struct OB{ int N; Cls *C; //CLS is another struct. } Is it possible to use Thrust to sort an array of OB according to N? Can you provide a simple example of using Thrust to sort objects? If Thrust is not able to do so, are there any other CUDA libraries that allow me to do so? Answer 1: The docs for thrust::sort show it accepts a comparison operator. See in their example how those are defined and used. I

Converting thrust::iterators to and from raw pointers

Question: I want to use the Thrust library to calculate the prefix sum of a device array in CUDA. My array is allocated with cudaMalloc(). My requirement is as follows: main() { Launch kernel 1 on data allocated through cudaMalloc() // This kernel will populate some data d. Use thrust to calculate prefix sum of d. Launch kernel 2 on prefix sum. } I want to use Thrust somewhere between my kernels, so I need a method to convert pointers to device iterators and back. What is wrong in the following code? int main() { int

Computing all-pairs distances between points in different sets with CUDA

I am trying to implement a brute force distance computation algorithm in CUDA. #define VECTOR_DIM 128 thrust::device_vector<float> feature_data_1; feature_data_1.resize(VECTOR_DIM * 1000); // 1000 128 dimensional points thrust::device_vector<float> feature_data_2; feature_data_2.resize(VECTOR_DIM * 2000); // 2000 128 dimensional points Now what I would like to do is to compute the L2 distances (sum of the squared differences) from every vector in the first matrix to every vector in the second matrix. So, if array 1 is of size 1000 and array 2 is of size 2000 , the result would be a floating

CUDA: how to use thrust::sort_by_key directly on the GPU? [duplicate]

Question: This question already has answers here: Thrust inside user written kernels (4 answers). Closed 4 years ago. The Thrust library can be used to sort data. The call might look like this (with a keys and a values vector): thrust::sort_by_key(d_keys.begin(), d_keys.end(), d_values.begin()); called on the CPU, with d_keys and d_values being in the CPU memory; and the bulk of the execution happens on the GPU. However, my data is already on the GPU. How can I use the Thrust library to perform