Call cublas in a kernel

佐手、 submitted on 2019-12-31 03:33:31

Question


I want to call cublasZgemv in parallel from inside a kernel:

__global__ void S_Cphir(cuDoubleComplex *S, cuDoubleComplex *A, cuDoubleComplex *B, int n, int l)
{
    ....
    cublasZgemv(handle, CUBLAS_OP_N, n, n, &alpha, S + i*n*n, n, A + n*i, 1, &beta, B + i*n, 1);
}

void S_Cphir_(cuDoubleComplex *S, cuDoubleComplex *A, cuDoubleComplex *B, int n, int l)
{
    dim3 grid = dim3(1, 1, 1);
    dim3 block = dim3(32, 1, 1);
    S_Cphir<<<grid, block>>>(S, A, B, n, l);
}

My compile commands are:

nvcc -c -arch=compute_30 -code=sm_35 time_propagation_cublas.cu --relocatable-device-code true
nvcc -o  ./main.v2 time_propagation_cublas.o -lcublas

The first command works, but the second one fails with a link error:

In function`__sti____cudaRegisterAll_58_tmpxft_000032b7_00000000_6_time_propagation_cublas_cpp1_ii_0d699356()';tmpxft_000032b7_00000000-3_time_propagation_cublas.cudafe1.cpp:(.text+0x17a4): 
undefined reference to `__cudaRegisterLinkedBinary_58_tmpxft_000032b7_00000000_6_time_propagation_cublas_cpp1_ii_0d699356'
collect2: ld returned 1 exit status

I searched for "cudaRegisterLinkedBinary" but found nothing.

I know that nvcc supports calling cuBLAS from a kernel.


Answer 1:


Use the CUBLAS Device Library sample code as your reference. On a standard CUDA 5.5 install, you'll find it at:

/usr/local/cuda/samples/7_CUDALibraries/simpleDevLibCUBLAS

Following the Makefile in that directory, your compile command should look like this:

nvcc -arch=sm_35 -rdc=true -o main.v2 time_propagation_cublas.cu -lcublas -lcublas_device -lcudadevrt
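
The question builds in two steps (compile to an object file, then link), while the command above does everything in one invocation. A two-step equivalent can be sketched as follows, modeled on how the sample's Makefile separates compiling from linking. It assumes a toolkit that still ships libcublas_device (CUDA 5.5 through 9.x; the library was removed in CUDA 10.0) and a GPU of compute capability 3.5 or higher. The undefined __cudaRegisterLinkedBinary_... symbol is normally supplied by the device-link step, so the link line presumably needs the same -arch flag and the device libraries, not just -lcublas.

```shell
# Step 1: compile with relocatable device code for sm_35
# (produces time_propagation_cublas.o).
nvcc -arch=sm_35 -rdc=true -c time_propagation_cublas.cu

# Step 2: link with nvcc so the device link runs as well; repeat -arch and add
# the device-side cuBLAS library and the CUDA device runtime.
nvcc -arch=sm_35 -o main.v2 time_propagation_cublas.o -lcublas -lcublas_device -lcudadevrt
```

If the second step still reports the same undefined reference, it may be worth checking that both steps use the same -arch value, since the device linker only picks up code built for the architecture it is asked to link.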


Source: https://stackoverflow.com/questions/19462779/call-cublas-in-a-kernel
