CUB::BlockRadixSort: how to deal with the last tile which is not full?

Posted by 痞子三分冷 on 2020-06-29 03:59:29

Question


There are 510 keys to sort. BLOCK_DIM_X = 128 and ITEMS_PER_THREAD = 4, so each tile covers 512 keys. We launch the kernel with a single block.

My kernel looks like this:

    typedef cub::BlockRadixSort<int, 128, 4>  BlockRadixSort;
    int thread_data[4];
    BlockLoad(temp_storage.load).Load(in_data, thread_data);
    CTA_SYNC();

    BlockRadixSort(temp_storage.sort).Sort(thread_data);
    CTA_SYNC();

    BlockStore(temp_storage.store).Store(out_data, thread_data);
    CTA_SYNC();

The problem is that BlockRadixSort sorts all 512 keys, not just the 510 valid ones. How can I exclude the last 2 items from the block sort?
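One common way to handle a partial tile, sketched here under assumptions rather than taken from the original post, is to leave the sort itself unchanged and guard the load and store instead. The guarded `BlockLoad::Load` overload takes a valid-item count and an out-of-bounds default, so the 2 missing slots can be filled with `INT_MAX`. In an ascending sort that padding ends up at the back of the tile, and the guarded `BlockStore::Store` overload then writes only the first 510 keys. The kernel name `SortPartialTile`, the host helper `sort_partial_tile`, and the `valid_items` parameter are hypothetical names introduced for illustration.

```cuda
#include <climits>
#include <vector>
#include <cuda_runtime.h>
#include <cub/cub.cuh>

constexpr int BLOCK_DIM_X      = 128;
constexpr int ITEMS_PER_THREAD = 4;

// Hypothetical kernel: sorts up to BLOCK_DIM_X * ITEMS_PER_THREAD keys with one block.
__global__ void SortPartialTile(const int* in_data, int* out_data, int valid_items)
{
    using BlockLoad      = cub::BlockLoad<int, BLOCK_DIM_X, ITEMS_PER_THREAD>;
    using BlockRadixSort = cub::BlockRadixSort<int, BLOCK_DIM_X, ITEMS_PER_THREAD>;
    using BlockStore     = cub::BlockStore<int, BLOCK_DIM_X, ITEMS_PER_THREAD>;

    // The three phases never overlap, so their shared memory can be aliased.
    __shared__ union {
        typename BlockLoad::TempStorage      load;
        typename BlockRadixSort::TempStorage sort;
        typename BlockStore::TempStorage     store;
    } temp_storage;

    int thread_data[ITEMS_PER_THREAD];

    // Guarded load: slots at index >= valid_items are filled with INT_MAX
    // instead of being read from memory.
    BlockLoad(temp_storage.load).Load(in_data, thread_data, valid_items, INT_MAX);
    __syncthreads();

    // Sort the full tile; the INT_MAX padding moves to the end.
    BlockRadixSort(temp_storage.sort).Sort(thread_data);
    __syncthreads();

    // Guarded store: only the first valid_items keys are written back.
    BlockStore(temp_storage.store).Store(out_data, thread_data, valid_items);
}

// Hypothetical host helper; assumes in.size() <= BLOCK_DIM_X * ITEMS_PER_THREAD.
// Error checking is omitted to keep the sketch short.
std::vector<int> sort_partial_tile(const std::vector<int>& in)
{
    const int n = static_cast<int>(in.size());
    int* d_in  = nullptr;
    int* d_out = nullptr;
    cudaMalloc(&d_in, n * sizeof(int));
    cudaMalloc(&d_out, n * sizeof(int));
    cudaMemcpy(d_in, in.data(), n * sizeof(int), cudaMemcpyHostToDevice);

    SortPartialTile<<<1, BLOCK_DIM_X>>>(d_in, d_out, n);

    std::vector<int> out(n);
    cudaMemcpy(out.data(), d_out, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return out;
}
```

Calling `sort_partial_tile` with the 510 keys would launch a single block with `valid_items = 510`. Padding with `INT_MAX` is safe for a keys-only ascending sort even if real keys equal `INT_MAX`, since equal keys are interchangeable. A descending sort would presumably pad with `INT_MIN` instead, and a key-value sort would need extra care so that padded pairs do not displace real ones.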

Source: https://stackoverflow.com/questions/62170084/cubblockradixsort-how-to-deal-with-the-last-tile-which-is-not-full
