Multiple occurrence subvector search with CUDA Thrust

浪子不回头ぞ posted on 2019-12-14 03:25:18

Question


I want to find the occurrences of a subvector in a device vector on the GPU, using the Thrust library.

For example, given the array str = "aaaabaaab", I need to find the occurrences of substr = "ab".

How should I use the thrust::find function to search for a subvector?

In a nutshell, how should I implement a string search algorithm with the Thrust library?


Answer 1:


I agree with the comments: Thrust doesn't provide a single function that does this in "typical Thrust fashion", and you would not want to use a sequence of Thrust functions (e.g. in a loop), as that would likely be quite inefficient.

A fairly simple CUDA kernel can be written that does this in a brute-force fashion.

For relatively simple CUDA kernels, we can realize something equivalent in Thrust in an "un-thrust-like" fashion, by simply passing the CUDA kernel code as a functor to a Thrust per-element operation such as thrust::transform or thrust::for_each.

Here is an example:

$ cat t462.cu
#include <iostream>
#include <iterator>   // std::ostream_iterator
#include <thrust/device_vector.h>
#include <thrust/transform.h>
#include <thrust/copy.h>
#include <thrust/iterator/counting_iterator.h>

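// Functor evaluated once per start index idx:
// returns true if `string` occurs in `array` starting at idx.
struct my_f
{
  char *array, *string;   // raw device pointers to the text and the pattern
  size_t arr_len;
  int    str_len;
  // Initializers follow the member declaration order to avoid reorder warnings.
  my_f(char *_array, size_t _arr_len, char *_string, int _str_len) :
    array(_array), string(_string), arr_len(_arr_len), str_len(_str_len) {}
  __host__ __device__
  bool operator()(size_t idx) const {
    for (int i=0; i < str_len; i++) {
      if ((i+idx) >= arr_len) return false;         // pattern would run past the end
      if (array[i+idx] != string[i]) return false;  // character mismatch
    }
    return true;
  }
};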

int main(){
  char data[] = "aaaabaaab";
  char str[] = "ab";
  size_t data_len = sizeof(data)-1;
  int str_len = sizeof(str)-1;
  thrust::device_vector<char> d_data(data, data+data_len);
  thrust::device_vector<char> d_str(str, str+str_len);
  thrust::device_vector<bool> result(data_len);
  thrust::transform(thrust::counting_iterator<size_t>(0),
                    thrust::counting_iterator<size_t>(data_len),
                    result.begin(),
                    my_f(thrust::raw_pointer_cast(d_data.data()), data_len,
                         thrust::raw_pointer_cast(d_str.data()), str_len));
  thrust::copy(result.begin(), result.end(), std::ostream_iterator<bool>(std::cout, ","));
  std::cout << std::endl;
}
$ nvcc -o t462 t462.cu
$ ./t462
0,0,0,1,0,0,0,1,0,
$
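
As a sanity check on the flags printed above, the same per-index comparison can be run on the host in plain C++. The following is only a minimal sketch and not part of the original answer: the helper names match_flags and match_positions are hypothetical, and it assumes a non-empty pattern. It also shows one possible way to turn the 0/1 flags into start positions, which is what the question ultimately asks for.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Host-side reference (hypothetical helper, not a Thrust API):
// applies the same test as my_f::operator() to every start index.
std::vector<bool> match_flags(const std::string& text, const std::string& pattern) {
    assert(!pattern.empty());  // assumption: an empty pattern is not a meaningful query
    std::vector<bool> flags(text.size(), false);
    for (std::size_t idx = 0; idx < text.size(); ++idx) {
        bool match = true;
        for (std::size_t i = 0; i < pattern.size(); ++i) {
            // Running past the end of the text or a mismatch means
            // no occurrence starts at idx.
            if (idx + i >= text.size() || text[idx + i] != pattern[i]) {
                match = false;
                break;
            }
        }
        flags[idx] = match;
    }
    return flags;
}

// Collapse the flag vector into the list of start positions (hypothetical helper).
std::vector<std::size_t> match_positions(const std::vector<bool>& flags) {
    std::vector<std::size_t> positions;
    for (std::size_t idx = 0; idx < flags.size(); ++idx) {
        if (flags[idx]) positions.push_back(idx);
    }
    return positions;
}
```

For the data used in the example, match_flags("aaaabaaab", "ab") sets the flags at indices 3 and 7, which agrees with the GPU output, and match_positions returns those two indices.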

Whether such a "brute-force" approach is efficient for this type of problem, I don't know. There are probably better, more efficient methods, especially when searching for occurrences of longer strings.



Source: https://stackoverflow.com/questions/55999456/multiple-occurrence-subvector-search-with-cuda-thrust
