Question
Does it make sense to use unsigned short integers in registers (to save register space) and in shared memory (for faster access) in CUDA programs?
I wrote a template device function (using registers and shared memory) and specialized it for uint and ushort. Results: for uint, 25 registers and 460 MB/s; for ushort, 26 registers and 420 MB/s.
So there is no reason to use unsigned short int.
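The original code was not posted, so the following is only a rough sketch of what such a comparison might look like. The kernel `neighbour_sum`, the helper `run_neighbour_sum`, the tile size of 256, and the test data are all hypothetical. One likely reason the ushort version does not save registers is that CUDA registers are 32 bits wide, so a 16-bit value still occupies a whole register and may need extra conversion instructions.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical kernel: each block stages a tile of type T in shared memory,
// then every thread adds its own element and its (wrapped) right neighbour.
template <typename T>
__global__ void neighbour_sum(const T* in, unsigned int* out, int n)
{
    __shared__ T tile[256];                    // assumes blockDim.x == 256
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    tile[threadIdx.x] = (i < n) ? in[i] : T(0);
    __syncthreads();                           // tile is fully written past this point
    int next = (threadIdx.x + 1) % blockDim.x; // neighbour inside the same tile
    if (i < n)
        out[i] = static_cast<unsigned int>(tile[threadIdx.x]) +
                 static_cast<unsigned int>(tile[next]);
}

// Hypothetical host helper: runs the kernel for element type T and returns
// the sum of all outputs so results can be compared across types.
// Assumes n > 0 and that a CUDA device is available (no error checking).
template <typename T>
unsigned long long run_neighbour_sum(int n)
{
    std::vector<T> h_in(n);
    for (int i = 0; i < n; ++i)
        h_in[i] = static_cast<T>(i % 1000);    // values fit in 16 bits

    T* d_in = nullptr;
    unsigned int* d_out = nullptr;
    cudaMalloc(&d_in, n * sizeof(T));
    cudaMalloc(&d_out, n * sizeof(unsigned int));
    cudaMemcpy(d_in, h_in.data(), n * sizeof(T), cudaMemcpyHostToDevice);

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    neighbour_sum<T><<<blocks, threads>>>(d_in, d_out, n);

    std::vector<unsigned int> h_out(n, 0);
    cudaMemcpy(h_out.data(), d_out, n * sizeof(unsigned int),
               cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);

    unsigned long long sum = 0;
    for (unsigned int v : h_out) sum += v;
    return sum;
}
```

Calling `run_neighbour_sum<unsigned int>(1024)` and `run_neighbour_sum<unsigned short>(1024)` should give the same result, and compiling with `nvcc -Xptxas -v` prints the register count of each instantiation, which is one way to reproduce a comparison like the one above.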
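Answer 1:
I don't have much experience with CUDA, but I've read that we should avoid unsigned types where possible: the CUDA C Best Practices Guide recommends signed integers over unsigned ones for loop counters, because signed overflow is undefined and that gives the compiler more room to optimize.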
Using shared memory can be one of the best ways to increase performance. Think about how your kernel accesses data. If it repeatedly reads the same values from global memory, or if threads need data loaded by other threads, use shared memory: all threads of a block load the data into shared memory, synchronize, and from then on read it from shared memory instead of global memory.
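As a minimal sketch of that load, synchronize, then reuse pattern, here is a 3-point sum where every input value is needed by up to three threads. The names `stencil3`, `run_stencil3`, and `BLOCK` are made up for illustration, and error checking is left out.

```cuda
#include <cuda_runtime.h>
#include <vector>

constexpr int BLOCK = 256;   // threads per block (assumed launch configuration)

// out[i] = in[i-1] + in[i] + in[i+1]; neighbours outside the array count as 0.
__global__ void stencil3(const int* in, int* out, int n)
{
    __shared__ int tile[BLOCK + 2];            // one halo cell on each side
    int g = blockIdx.x * blockDim.x + threadIdx.x;  // global index
    int l = threadIdx.x + 1;                        // local index, shifted by halo

    // Each thread reads its own element from global memory exactly once.
    tile[l] = (g < n) ? in[g] : 0;
    // The first and last threads of the block also fetch the halo cells.
    if (threadIdx.x == 0)
        tile[0] = (g >= 1 && g - 1 < n) ? in[g - 1] : 0;
    if (threadIdx.x == blockDim.x - 1)
        tile[l + 1] = (g + 1 < n) ? in[g + 1] : 0;

    __syncthreads();                           // wait until the whole tile is loaded

    // All three reads now come from shared memory, not global memory.
    if (g < n)
        out[g] = tile[l - 1] + tile[l] + tile[l + 1];
}

// Hypothetical host wrapper: copies the input to the device, runs the
// kernel, and returns the result. Assumes a CUDA device is available.
std::vector<int> run_stencil3(const std::vector<int>& h_in)
{
    const int n = static_cast<int>(h_in.size());
    std::vector<int> h_out(n, 0);
    if (n == 0) return h_out;

    int* d_in = nullptr;
    int* d_out = nullptr;
    cudaMalloc(&d_in, n * sizeof(int));
    cudaMalloc(&d_out, n * sizeof(int));
    cudaMemcpy(d_in, h_in.data(), n * sizeof(int), cudaMemcpyHostToDevice);

    stencil3<<<(n + BLOCK - 1) / BLOCK, BLOCK>>>(d_in, d_out, n);

    cudaMemcpy(h_out.data(), d_out, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return h_out;
}
```

Whether this beats reading straight from global memory depends on the hardware and the access pattern, since caches already absorb some repeated reads, so it is worth measuring both versions.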
It all depends on what you want to do. If you want help optimizing your kernel, please post some code.
Source: https://stackoverflow.com/questions/12733856/does-it-make-sense-to-use-an-unsigned-short-integer-for-registers-and-shared-mem