How to specify alignment for global device variables in CUDA

Submitted by 风流意气都作罢 on 2019-12-10 10:04:41

Question


I would like to declare the alignment of a global device variable in CUDA. Specifically, I have a string declaration such as __device__ char str1[] = "some pre-defined string"; With plain gcc, I can request alignment from the compiler with __device__ char str1[] __attribute__ ((aligned (4))) = "some pre-defined string";

However, when I tried this with nvcc, the compiler ignored the request. I want to do this because my kernels copy these strings into a buffer, and copying a word at a time is much faster than copying a byte at a time, but it requires the source string to be aligned. Can anyone tell me how to request alignment from the nvcc compiler?


Answer 1:


See section 5.3.2, "Size and Alignment Requirement", of the "CUDA C Programming Guide":

The alignment requirement is automatically fulfilled for the built-in types (char, short, int, long, long long, float, double) and for the built-in vector types such as float2 or float4.

For structs, the size and alignment requirements can be enforced by the compiler using the alignment specifiers __align__(8) or __align__(16).

Example usage:

struct __align__(8) { 
    float r; 
    float i;
} complex_num;
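
One way this could be applied to the string from the question is to wrap the character array in an aligned struct. The following is only a sketch: the names Complex, AlignedStr and text are made up for illustration, and the array size of 24 assumes the exact string from the question (23 characters plus the terminating NUL).

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Named version of the struct above. With 8-byte alignment the compiler
// can load or store both floats with a single 8-byte access.
struct __align__(8) Complex {
    float r;
    float i;
};
static_assert(alignof(Complex) == 8, "Complex should be 8-byte aligned");

// Hypothetical wrapper (not from the original answer): the char array
// starts at the struct's address, so it shares the 4-byte alignment.
struct __align__(4) AlignedStr {
    char text[24];  // 23 characters plus the terminating NUL
};
static_assert(alignof(AlignedStr) == 4, "AlignedStr should be 4-byte aligned");

__device__ AlignedStr str1 = { "some pre-defined string" };

int main()
{
    // Copy the device variable back to the host to confirm its contents.
    AlignedStr host_copy;
    cudaError_t err = cudaMemcpyFromSymbol(&host_copy, str1, sizeof(str1));
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("%s\n", host_copy.text);
    return 0;
}
```

The static_assert lines make the compiler reject the build if the requested alignment is not reflected in the type, which is cheaper than discovering a misaligned access at run time.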



Answer 2:


Can you check if this works?

__device__ char __align__(4) str1[] = "some pre-defined string";
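
To check whether the request is honored, a kernel can test the address at run time and only then copy a word at a time. This is a rough sketch, not a confirmed fix: copy_str and copy_matches are hypothetical names, the specifier is placed before the type (an assumption about what nvcc accepts most readily), and the kernel falls back to a byte copy if either pointer turns out to be misaligned.

```cuda
#include <cstdio>
#include <cstdint>
#include <cstring>
#include <cuda_runtime.h>

// Request 4-byte alignment for the global device string.
__device__ __align__(4) char str1[] = "some pre-defined string";

// The word-wise path reads whole 32-bit words, so the array size
// (24 bytes including the NUL) must be a multiple of 4.
static_assert(sizeof(str1) % sizeof(uint32_t) == 0,
              "pad str1 to a multiple of 4 bytes");

// Hypothetical kernel: copies str1 into dst and records which path was used.
__global__ void copy_str(char *dst, int *used_words)
{
    const bool aligned =
        reinterpret_cast<uintptr_t>(&str1[0]) % sizeof(uint32_t) == 0 &&
        reinterpret_cast<uintptr_t>(dst) % sizeof(uint32_t) == 0;
    *used_words = aligned ? 1 : 0;

    if (aligned) {
        // Word-at-a-time copy, valid only because both pointers are aligned.
        const uint32_t *src32 = reinterpret_cast<const uint32_t *>(&str1[0]);
        uint32_t *dst32 = reinterpret_cast<uint32_t *>(dst);
        for (unsigned i = 0; i < sizeof(str1) / sizeof(uint32_t); ++i)
            dst32[i] = src32[i];
    } else {
        // Byte-at-a-time fallback.
        for (unsigned i = 0; i < sizeof(str1); ++i)
            dst[i] = str1[i];
    }
}

// Hypothetical host helper: runs the kernel and compares the result.
bool copy_matches(int *used_words)
{
    char *d_dst = nullptr;
    int *d_flag = nullptr;
    if (cudaMalloc(&d_dst, sizeof(str1)) != cudaSuccess) return false;
    if (cudaMalloc(&d_flag, sizeof(int)) != cudaSuccess) {
        cudaFree(d_dst);
        return false;
    }

    copy_str<<<1, 1>>>(d_dst, d_flag);

    char h_dst[sizeof(str1)] = {0};
    cudaError_t e1 = cudaMemcpy(h_dst, d_dst, sizeof(str1), cudaMemcpyDeviceToHost);
    cudaError_t e2 = cudaMemcpy(used_words, d_flag, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_dst);
    cudaFree(d_flag);
    return e1 == cudaSuccess && e2 == cudaSuccess &&
           std::strcmp(h_dst, "some pre-defined string") == 0;
}

int main()
{
    int used_words = 0;
    bool ok = copy_matches(&used_words);
    printf("copy ok: %d, word-wise path used: %d\n", ok, used_words);
    return ok ? 0 : 1;
}
```

The run-time check keeps the kernel correct either way, so the flag it reports simply tells you whether the compiler honored the alignment request.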


Source: https://stackoverflow.com/questions/13205742/how-to-specify-alignment-for-global-device-variables-in-cuda
