CUDA kernel - nested for loop

[愿得一人] 2020-12-08 09:01

Hello, I'm trying to write a CUDA kernel that performs the following piece of code.

for (n = 0; n < (total-1); n++)
{
  a = values[n];

  for ( i = n+1; i &         


        
3 answers
  •  南笙 (OP)
    2020-12-08 09:51

    Why don't you just remove the outer loop and launch the kernel with as many threads as this loop needs? It's a bit odd to have a loop that depends on your block index; normally you try to avoid such loops. Secondly, it seems to me that newvalues[i] can be overwritten by different threads, which is a race condition.
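
As a rough illustration of that suggestion, here is a minimal sketch in which each thread takes one iteration n of the former outer loop. The original inner loop is cut off in the question, so the bound (i < total) and the body (adding a into newvalues[i]) are assumptions made up for this example, as are the names pairKernel and runPairKernel. Because several threads can touch the same newvalues[i], the update is done with atomicAdd rather than a plain store.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical kernel: one thread replaces one iteration n of the outer loop.
__global__ void pairKernel(const float* values, float* newvalues, int total)
{
    int n = blockIdx.x * blockDim.x + threadIdx.x;
    if (n >= total - 1) return;            // same range as: n < (total-1)

    float a = values[n];

    // ASSUMPTION: the truncated inner loop runs i from n+1 up to total-1.
    for (int i = n + 1; i < total; ++i) {
        // ASSUMPTION: the body accumulates into newvalues[i]. Threads with
        // different n write the same element, so the update must be atomic.
        atomicAdd(&newvalues[i], a);
    }
}

// Hypothetical host helper: copies the input, launches the kernel, and
// returns newvalues. Error checking is omitted to keep the sketch short.
std::vector<float> runPairKernel(const std::vector<float>& values)
{
    int total = static_cast<int>(values.size());
    std::vector<float> result(total, 0.0f);
    if (total < 2) return result;          // nothing to do

    size_t bytes = total * sizeof(float);
    float* dValues = nullptr;
    float* dNew = nullptr;
    cudaMalloc(&dValues, bytes);
    cudaMalloc(&dNew, bytes);
    cudaMemcpy(dValues, values.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemset(dNew, 0, bytes);

    // Launch enough threads to cover the total-1 outer iterations.
    int threads = 256;
    int blocks = (total - 1 + threads - 1) / threads;
    pairKernel<<<blocks, threads>>>(dValues, dNew, total);

    // cudaMemcpy on the default stream waits for the kernel to finish.
    cudaMemcpy(result.data(), dNew, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dValues);
    cudaFree(dNew);
    return result;
}
```

Under these assumptions, calling runPairKernel on {1, 2, 3, 4} gives newvalues[i] equal to the sum of all values before index i. If the real loop body only overwrites newvalues[i] instead of accumulating, atomicAdd is not the right fix; in that case it may be simpler to give each thread one i and let it loop over n itself.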
