Could a CUDA kernel call a cublas function?

Submitted by ε祈祈猫儿з on 2019-12-03 19:51:56

Question


I know it sounds weird, but here is my scenario:

I need to do a matrix-matrix multiplication (A(n*k)*B(k*n)), but I only need the diagonal elements of the output matrix to be evaluated. I searched the cuBLAS library and didn't find any level 2 or 3 function that can do that. So I decided to distribute each row of A and each column of B to CUDA threads. For each thread (idx), I need to calculate the dot product "A[idx,:]*B[:,idx]" and save it as the corresponding diagonal output. Since this dot product also takes some time, I wonder whether I could somehow call a cuBLAS function here (say cublasSdot) to compute it.
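For reference, a minimal sketch of the hand-rolled approach described above (the kernel name and the row-major layout are my assumptions, not part of the original question):

    // Minimal sketch, assuming row-major storage: one thread per diagonal
    // element, each computing the dot product A[idx, :] . B[:, idx].
    __global__ void diagOfMatMul(const float *A, const float *B,
                                 float *diag, int n, int k)
    {
        int idx = blockIdx.x * blockDim.x + threadIdx.x;
        if (idx >= n) return;

        float sum = 0.0f;
        for (int j = 0; j < k; ++j)
            sum += A[idx * k + j] * B[j * n + idx];   // A[idx,j] * B[j,idx]
        diag[idx] = sum;
    }

Launched with, e.g., diagOfMatMul<<<(n + 255) / 256, 256>>>(dA, dB, dDiag, n, k); the question is whether the inner loop could instead be replaced by a cuBLAS call from inside the kernel.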

If I have missed a cuBLAS function that achieves this directly (computing only the diagonal elements of a matrix-matrix product), this question can be disregarded.


Answer 1:


Yes, it can.

"The language interface and Device Runtime API available in CUDA C/C++ is a subset of the CUDA Runtime API available on the Host. The syntax and semantics of the CUDA Runtime API have been retained on the device in order to facilitate ease of code reuse for API routines that may run in either the host or device environments. A kernel can also call GPU libraries such as CUBLAS directly without needing to return to the CPU." Source

Here you can see an example of matrix-vector multiplication using CUDA and the cuBLAS library function cublasSgemv called from device code.
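As a rough sketch of what such a device-side call could look like for the dot products in the question (the per-thread handle management and row-major layout are my assumptions; the cuBLAS device API needs compute capability 3.5+, relocatable device code, and is only shipped with older toolkits):

    #include <cublas_v2.h>

    // Sketch only: each thread asks device-side cuBLAS for one dot product.
    // Every cublasSdot call launches a child grid, so doing this once per
    // thread is expensive; it is shown purely to illustrate the API.
    __global__ void diagWithCublas(const float *A, const float *B,
                                   float *diag, int n, int k)
    {
        int idx = blockIdx.x * blockDim.x + threadIdx.x;
        if (idx >= n) return;

        cublasHandle_t handle;
        cublasCreate(&handle);                 // per-thread handle (costly)

        // Row idx of A (stride 1) dotted with column idx of B (stride n),
        // result written directly into the diagonal output array.
        cublasSdot(handle, k,
                   A + idx * k, 1,
                   B + idx, n,
                   &diag[idx]);

        cublasDestroy(handle);
    }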




Answer 2:


Make sure you are using the device library to call cuBLAS. You can't use the same library that you use to call it from the host; details about using the CUDA device library can be found in the CUDA Toolkit documentation: http://docs.nvidia.com/cuda/cublas/index.html#device-api

Look at the CUDA 5 samples under 7_CUDALibraries/.
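The build line for those device-library samples looked roughly like the following (exact flags vary by toolkit version; this assumes a CUDA 5/6-era installation where the cuBLAS device library was still shipped):

    nvcc -arch=sm_35 -rdc=true diag.cu -o diag -lcublas_device -lcudadevrt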



Source: https://stackoverflow.com/questions/13371082/could-a-cuda-kernel-call-a-cublas-function
