What is the most efficient way to transpose a matrix in CUDA?


予麋鹿 2021-01-06 16:27

I have an M*N matrix in host memory, and when copying it into device memory I need it transposed into an N*M matrix. Is there any CUDA (cuBLAS...)

3 answers

萌比男神i (OP)

2021-01-06 17:16

CULA has auxiliary routines to compute the transpose (culaDevice?geTranspose). For a square matrix you can also transpose in place (culaDevice?geTransposeInplace).

Note: CULA has a free license available, if you meet certain conditions.
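
Since the question mentions cuBLAS, here is a minimal sketch of a different route from the CULA routines above: an out-of-place transpose with cublasSgeam, run after a plain host-to-device copy. The helper name transposeViaGeam is hypothetical, and the sketch assumes single-precision data stored row-major on the host. It copies the result back to the host only so it can be checked; in real use you would keep working with the device buffer.

```cuda
#include <vector>
#include <cuda_runtime.h>
#include <cublas_v2.h>

// Hypothetical helper (not from the thread): copies a row-major M x N host
// matrix to the device, transposes it there with cublasSgeam, and returns the
// row-major N x M result. Returns an empty vector on any CUDA/cuBLAS failure.
std::vector<float> transposeViaGeam(const std::vector<float>& h_in, int M, int N) {
    std::vector<float> h_out;
    float *d_in = nullptr, *d_out = nullptr;
    cublasHandle_t handle = nullptr;
    const size_t bytes = sizeof(float) * M * N;

    bool ok = cudaMalloc(&d_in, bytes) == cudaSuccess
           && cudaMalloc(&d_out, bytes) == cudaSuccess
           && cudaMemcpy(d_in, h_in.data(), bytes, cudaMemcpyHostToDevice) == cudaSuccess
           && cublasCreate(&handle) == CUBLAS_STATUS_SUCCESS;

    if (ok) {
        // cuBLAS is column-major, so the row-major M x N buffer is seen as a
        // column-major N x M matrix A with leading dimension N.
        // geam computes C = alpha * op(A) + beta * op(B); with op(A) = A^T and
        // beta = 0, C is the M x N column-major transpose (leading dimension M),
        // which is the same memory layout as a row-major N x M matrix.
        const float alpha = 1.0f, beta = 0.0f;
        ok = cublasSgeam(handle, CUBLAS_OP_T, CUBLAS_OP_N, M, N,
                         &alpha, d_in, N,
                         &beta, d_out, M,   // B is not read when beta == 0
                         d_out, M) == CUBLAS_STATUS_SUCCESS;
    }

    if (ok) {
        h_out.resize(static_cast<size_t>(M) * N);
        // Blocking copy; also waits for the geam call on the default stream.
        if (cudaMemcpy(h_out.data(), d_out, bytes, cudaMemcpyDeviceToHost) != cudaSuccess)
            h_out.clear();
    }

    if (handle) cublasDestroy(handle);
    cudaFree(d_in);
    cudaFree(d_out);
    return h_out;
}
```

Calling transposeViaGeam(h, M, N) on a row-major M x N vector h should give back the N x M transpose. Build with nvcc and link against cuBLAS (-lcublas). Whether this beats a hand-written shared-memory tiled kernel depends on the matrix size and GPU, so measure before choosing.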
