Fusing a triangle loop for parallelization, calculating sub-indices

余生分开走 2020-12-10 06:42

A common technique in parallelization is to fuse nested for loops like this

for(int i=0; i

to

3 Answers
  •  我在风中等你
    2020-12-10 07:12

    Considering that you're trying to fuse a triangle with the intent of parallelizing, the non-obvious solution is to choose a non-trivial mapping of x to (i,j):

    j |\ i ->
      | \             ____
    | |  \    =>    |\\   |
    V |___\         |_\\__|
    

    After all, you're not processing the (i,j) pairs in any particular order, so the exact mapping doesn't matter, as long as every pair is visited exactly once.

    So compute (i,j) from x as you would for a rectangle, but if i > j, set i = N-i and j = N-j (mirror the Y axis, then mirror the X axis).

       ____
     |\\   |      |\           |\
     |_\\__|  ==> |_\  __  =>  | \
                      / |      |  \
                     /__|      |___\
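
    To make the mirroring concrete, here is a minimal sketch in C. It rests on assumptions the text above doesn't spell out: the triangle is taken to be 0 <= j <= i < N (diagonal included), the rectangle is N+1 cells wide so that row r and row N-1-r together fill exactly one rectangle row, and the mirror offsets (N-1-r, N-c) are my choice for that layout. The names tri_unfuse and tri_checksum are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: maps a fused index x in [0, n*(n+1)/2)
 * to a pair (i, j) with 0 <= j <= i < n. */
static void tri_unfuse(size_t x, size_t n, size_t *i, size_t *j)
{
    assert(n > 0 && x < n * (n + 1) / 2);

    /* Treat x as an index into a rectangle that is n+1 cells wide. */
    size_t r = x / (n + 1);
    size_t c = x % (n + 1);

    if (c <= r) {
        /* The cell already lies inside the triangle. */
        *i = r;
        *j = c;
    } else {
        /* The cell lies outside: mirror both axes so it lands
         * in row n-1-r, which the first branch never produces. */
        *i = n - 1 - r;
        *j = n - c;
    }
}

/* Visits every pair once through the fused loop and returns the sum
 * of i*n + j. Iterations are independent of each other, so the loop
 * over x could be divided among threads. */
static size_t tri_checksum(size_t n)
{
    size_t total = n * (n + 1) / 2;
    size_t sum = 0;
    for (size_t x = 0; x < total; x++) {
        size_t i, j;
        tri_unfuse(x, n, &i, &j);
        sum += i * n + j;
    }
    return sum;
}
```

    With N = 4, x = 0..9 yields (0,0), (3,3), (3,2), (3,1), (3,0), (1,0), (1,1), (2,2), (2,1), (2,0), so each pair of the triangle shows up exactly once. A strict triangle (j < i) would need its own width and offsets.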
    
