Insight into the first argument mask in __shfl_sync()

这一生的挚爱 posted on 2019-12-24 23:19:15

Question


Here is the test code for broadcasting a variable:

#include <stdio.h>
#include <cuda_runtime.h>

__global__ void broadcast(){
    int lane_id = threadIdx.x & 0x1f;
    int value = 31 - lane_id;
    // Each lane receives the value held by the lane whose lane ID
    // is 2 less than its own (lanes 0 and 1 keep their own value)
    int broadcasted_value = __shfl_up_sync(0xffffffff, value, 2);
    value = broadcasted_value;
    printf("thread %d final value = %d\n", threadIdx.x, value);
}

int main() {
    broadcast<<<1,32>>>();
    cudaDeviceSynchronize();
    return 0;
}

In effect, this question is the same as the one on this page. The results of the shuffle did not vary at all, no matter how I modified the mask (e.g. 0x00000000, 0x00000001, etc.). So how should I properly understand the effect of the mask?


Answer 1:


The mask parameter forces warp reconvergence, for the warp lanes identified with a 1 bit, prior to performing the requested shuffle operation. This assumes such reconvergence is possible, i.e. not prevented by conditional coding. If it is prevented by conditional coding, your code is illegal and you are exploring undefined behavior (UB).
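
To make the reconvergence point concrete, here is a minimal sketch (not from the original question; the kernel name and the values are made up for illustration). The lanes take different branches, and the shuffle is placed after the conditional so that all 32 lanes named in the mask are able to reach it:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel name, for illustration only.
__global__ void shuffle_after_divergence() {
    int lane_id = threadIdx.x & 0x1f;
    int value;

    // Conditional code: even and odd lanes may execute these
    // branches separately, so the warp can be diverged here.
    if (lane_id % 2 == 0) {
        value = lane_id * 10;
    } else {
        value = lane_id * 100;
    }

    // The shuffle sits outside the conditional, so every lane can reach it.
    // The full mask names all 32 lanes; they are brought back together
    // before lane 3's value is read.
    int from_lane_3 = __shfl_sync(0xffffffff, value, 3);
    printf("lane %d received %d\n", lane_id, from_lane_3);
}

int main() {
    shuffle_after_divergence<<<1, 32>>>();
    cudaDeviceSynchronize();
    return 0;
}
```

Lane 3 is odd, so it holds 3 * 100, and every lane should report 300.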

For warp lanes that are already converged and active, it has no effect. It does not prevent lanes from participating in the shuffle operation if the mask parameter is set to zero. It also does not force inactive lanes to participate (inactive lanes would be lanes that are excluded by conditional coding).

Since your code has no conditional behavior, there is no reason to believe there would be any lack of convergence, and therefore no change in behavior regardless of your mask parameter.

That does not mean it is correct to specify a mask of 0. Your code is illegal if you expect lanes to participate but have not set their corresponding bit to 1 in the mask, and you would potentially be exploring UB in the event of warp divergence.
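
One way to keep the mask honest when the shuffle has to live inside conditional code is to compute it with a warp vote before the branch. The following is only a sketch under that assumption (the kernel name, the lane_id < 16 predicate, and the variable names are illustrative, not from the question):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel name, for illustration only.
__global__ void shuffle_inside_branch() {
    int lane_id = threadIdx.x & 0x1f;
    int value = 31 - lane_id;
    bool participates = lane_id < 16;

    // All 32 lanes vote before the branch. The result has a 1 bit for
    // each lane whose predicate is true, here 0x0000ffff.
    unsigned mask = __ballot_sync(0xffffffff, participates);

    if (participates) {
        // Only lanes 0..15 reach this call, and the mask names exactly
        // those lanes. The source lane (0) is one of the participants.
        int from_lane_0 = __shfl_sync(mask, value, 0);
        printf("lane %d received %d\n", lane_id, from_lane_0);
    }
}

int main() {
    shuffle_inside_branch<<<1, 32>>>();
    cudaDeviceSynchronize();
    return 0;
}
```

Lanes 0 through 15 should each report 31, the value held by lane 0. Lanes 16 through 31 never execute the shuffle and are not named in the mask, so nothing is expected of them.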

For other descriptions of the mask, there are a number of answers here already.

There's a chance any follow-up questions you may have are already answered in one of those.



Source: https://stackoverflow.com/questions/58833808/insight-into-the-first-argument-mask-in-shfl-sync
