What is the most efficient way to transpose a matrix in CUDA?

予麋鹿 2021-01-06 16:27

I have an M*N matrix in host memory, and when copying it to device memory I need it transposed into an N*M matrix. Is there any CUDA (cuBLAS...)

3 answers

一向 (original poster)

2021-01-06 17:30
In the cuBLAS API:
```
cublas<t>geam()

This function performs the matrix-matrix addition/transposition.
The user can transpose matrix A by setting *alpha=1 and *beta=0
```
(and specifying the transa operator as CUBLAS_OP_T for transpose).
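As a rough illustration of the suggestion above, here is a minimal sketch using the single-precision variant, `cublasSgeam`. It assumes the host matrix is stored row-major (C-style), which the question does not state; the sizes `M = 3`, `N = 4`, the buffer names, and the `CHECK_*` macros are made up for the example. Because cuBLAS is column-major, a row-major M*N buffer looks to cuBLAS like an N*M matrix with leading dimension N, so the output is declared as M rows by N columns with leading dimension M.

```cuda
// Sketch only: build with something like `nvcc transpose_geam.cu -lcublas`.
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>
#include <cublas_v2.h>

// Hypothetical helper macros for this example; they abort on any failure.
#define CHECK_CUDA(call)                                              \
    do {                                                              \
        cudaError_t e_ = (call);                                      \
        if (e_ != cudaSuccess) {                                      \
            std::fprintf(stderr, "CUDA error: %s\n",                  \
                         cudaGetErrorString(e_));                     \
            std::exit(EXIT_FAILURE);                                  \
        }                                                             \
    } while (0)

#define CHECK_CUBLAS(call)                                            \
    do {                                                              \
        if ((call) != CUBLAS_STATUS_SUCCESS) {                        \
            std::fprintf(stderr, "cuBLAS call failed\n");             \
            std::exit(EXIT_FAILURE);                                  \
        }                                                             \
    } while (0)

int main() {
    const int M = 3, N = 4;  // assumed sizes: host matrix is M rows x N cols

    // Row-major host input: h_in[r * N + c] holds element (r, c).
    std::vector<float> h_in(M * N), h_out(N * M);
    for (int i = 0; i < M * N; ++i) h_in[i] = static_cast<float>(i);

    float *d_in = nullptr, *d_out = nullptr;
    CHECK_CUDA(cudaMalloc(&d_in, M * N * sizeof(float)));
    CHECK_CUDA(cudaMalloc(&d_out, N * M * sizeof(float)));
    CHECK_CUDA(cudaMemset(d_out, 0, N * M * sizeof(float)));
    CHECK_CUDA(cudaMemcpy(d_in, h_in.data(), M * N * sizeof(float),
                          cudaMemcpyHostToDevice));

    cublasHandle_t handle;
    CHECK_CUBLAS(cublasCreate(&handle));

    // C = alpha * op(A) + beta * op(B), with alpha = 1, beta = 0, op(A) = A^T.
    // In column-major terms: A is N x M (lda = N), C is M x N (ldc = M).
    // B is not needed when beta = 0; d_out is passed for it, which matches
    // the in-place form the API allows (C == B, ldc == ldb, transb = N).
    const float alpha = 1.0f, beta = 0.0f;
    CHECK_CUBLAS(cublasSgeam(handle, CUBLAS_OP_T, CUBLAS_OP_N,
                             M, N,
                             &alpha, d_in, N,
                             &beta, d_out, M,
                             d_out, M));

    CHECK_CUDA(cudaMemcpy(h_out.data(), d_out, N * M * sizeof(float),
                          cudaMemcpyDeviceToHost));

    // Read back as row-major N x M: h_out[c * M + r] should equal h_in[r * N + c].
    bool ok = true;
    for (int r = 0; r < M; ++r)
        for (int c = 0; c < N; ++c)
            if (h_out[c * M + r] != h_in[r * N + c]) ok = false;
    std::printf("transpose %s\n", ok ? "matches" : "MISMATCH");

    CHECK_CUBLAS(cublasDestroy(handle));
    CHECK_CUDA(cudaFree(d_in));
    CHECK_CUDA(cudaFree(d_out));
    return ok ? EXIT_SUCCESS : EXIT_FAILURE;
}
```

Note that this transposes after the copy, using a second device buffer; `geam` does not support transposing a matrix in place onto itself. For other element types, the `D`, `C`, and `Z` variants of the same function take the same arguments with the corresponding types.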