CULA

how to use the cula device

拜拜、爱过 submitted on 2019-12-13 11:14:23
Question: I am a little confused about how to use the CULA device interface. Right now I am using the CULA interface from a .cpp file, and I am generating some random numbers in a .cu file. cu file: ... __global__ void kernel( double * A, double * B, curandState * globalState, int Asize, int Bsize ) { // generate random numbers ... void kernel_wrapper( double ** const A_host, double ** const B_host, const int Asize, const int Bsize ) { ... // create random states curandState * devStates; gpuErrchk(
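A minimal sketch of the pattern this excerpt describes, assuming hypothetical kernel and wrapper names (the original code is truncated): one kernel seeds per-thread cuRAND states, a second fills a device array, and the device-resident result can then be passed straight to a `culaDevice*` routine, which expects device pointers.

```cuda
#include <cuda_runtime.h>
#include <curand_kernel.h>

// Seed one curandState per thread.
__global__ void setup_states( curandState * states, unsigned long long seed, int n )
{
    int id = blockIdx.x * blockDim.x + threadIdx.x;
    if ( id < n ) curand_init( seed, id, 0, &states[id] );
}

// Fill A with uniform random doubles using the per-thread states.
__global__ void kernel( double * A, curandState * states, int Asize )
{
    int id = blockIdx.x * blockDim.x + threadIdx.x;
    if ( id < Asize ) A[id] = curand_uniform_double( &states[id] );
}

void kernel_wrapper( double * A_dev, const int Asize )
{
    curandState * devStates;
    cudaMalloc( &devStates, Asize * sizeof(curandState) );

    int block = 256, grid = ( Asize + block - 1 ) / block;
    setup_states<<<grid, block>>>( devStates, 1234ULL, Asize );
    kernel<<<grid, block>>>( A_dev, devStates, Asize );
    cudaDeviceSynchronize();

    cudaFree( devStates );
    // A_dev now holds device-resident data that can be handed to a
    // CULA device-interface routine (e.g. culaDeviceDgeqrf) from the
    // host side of the .cpp file, without copying back to the host.
}
```

The key point for the device interface is that the data never needs to leave the GPU: the .cu file fills device memory, and the .cpp file calls the `culaDevice*` function on the same pointer.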

gputools: error in installation

霸气de小男生 submitted on 2019-12-11 07:30:26
Question: I am setting up a new Dell Precision workstation with an NVIDIA Tesla 2050 GPU card, and I would like to install R's gputools package. My OS is openSUSE 11.3 with KDE 4.4. I downloaded NVIDIA's CUDA Toolkit 3.2 and installed it in /usr/local/cuda; I also downloaded the latest version of the CULA Tools set (version R10) and installed it in /usr/local/cula. When trying to install gputools from within R using install.packages("gputools"), I get the following error message: classification.cu(735):
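Build failures like this usually mean the gputools build cannot find the CUDA or CULA installations. A hedged configuration sketch, assuming a bash shell and the paths from the question; the exact environment variables consulted vary between gputools versions, so check the package's configure script:

```shell
# Point the build at the non-default install locations, then
# install gputools from a downloaded source tarball.
export CUDA_HOME=/usr/local/cuda
export CULA_HOME=/usr/local/cula
export PATH="$CUDA_HOME/bin:$PATH"
export LD_LIBRARY_PATH="$CUDA_HOME/lib64:$CULA_HOME/lib64:$LD_LIBRARY_PATH"

R CMD INSTALL gputools_*.tar.gz
```

Installing from the tarball with `R CMD INSTALL` (rather than `install.packages()` inside R) makes it easier to see which compiler invocation fails.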

Rcpp and CULA: segmentation fault

浪子不回头ぞ submitted on 2019-12-07 23:58:08
Question: I extracted the relevant bits from the gputools R package to run a QR decomposition on my GPU via Rcpp, by dynamically loading a shared library that links to culatools. Everything runs smoothly in the terminal and in R.app on my Mac, and the results agree with R's qr() function. The problem is that a segmentation fault occurs on exiting R.app (the error does not occur when using the terminal): *** caught segfault *** address 0x10911b050, cause 'memory not mapped' I think I narrowed down the
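A segfault on interpreter exit often points at CULA being shut down at the wrong time (or not at all) relative to the teardown of the CUDA context when the process exits. A sketch of the init/shutdown pairing such a shared library should enforce; the exported wrapper names here are hypothetical, but `culaInitialize`/`culaShutdown` are the real culatools entry points:

```cpp
#include <cula.h>  // culatools header

// Called once, right after dyn.load() on the R side.
extern "C" void gpu_qr_init()
{
    culaStatus s = culaInitialize();
    if ( s != culaNoError ) {
        // report the status to R and refuse to run any GPU routine
    }
}

// Called exactly once from R's .onUnload / dyn.unload hook,
// while the CUDA runtime is still alive -- not from a C++
// static destructor, whose order at process exit is undefined.
extern "C" void gpu_qr_shutdown()
{
    culaShutdown();
}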

Can CULA routines be called from device kernels?

社会主义新天地 submitted on 2019-12-01 10:38:03
So I'm trying to see whether I can get a significant speedup by using a GPU to solve many small overdetermined systems of equations at the same time. My current algorithm uses an LU decomposition function from the CULA Dense library and has to switch back and forth between the GPU and the CPU to initialize and run the CULA functions. I would like to be able to call the CULA functions from my CUDA kernels so that I don't have to jump back to the CPU and copy the data back. This would also allow me to create multiple threads that are working on different data sets
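For reference, CULA's routines are host-callable only: even the device interface, which accepts device pointers, must be invoked from CPU code, not from inside a `__global__` kernel. What the question asks for can instead be approximated by keeping the matrices resident on the GPU and issuing the factorizations from a host-side loop, so only the calls (not the data) cross to the CPU. A hedged sketch, assuming device-resident matrices and CULA Dense's `culaDeviceDgetrf` LU routine (the exact header name varies by CULA release):

```cuda
#include <cula_lapack_device.h>  // CULA Dense device interface; name varies by release

// Factor `count` independent m-by-n systems back to back from the host.
// A_dev[i] and ipiv_dev[i] live in device memory the whole time; the
// only CPU involvement is issuing the calls and checking statuses.
void lu_factor_batch( double * A_dev[], int * ipiv_dev[],
                      int m, int n, int count )
{
    for ( int i = 0; i < count; ++i ) {
        culaStatus s = culaDeviceDgetrf( m, n, A_dev[i], m, ipiv_dev[i] );
        if ( s != culaNoError ) {
            // handle the failure for system i (e.g. record and continue)
        }
    }
}
```

This avoids the per-solve host/device copies the question complains about, though the launches themselves are still serialized from the host; for very small systems a batched solver designed for that case may be the better fit.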
