cuBLAS matrix inverse much slower than MATLAB


As @RobertCrovella said, you should not use batched small matrix APIs for a single large matrix inversion.

Basically you can use the same approach as in your code, but with the non-batched versions of getrf() and getri() to maximize performance for a large matrix.

You can find getrf() here:

http://docs.nvidia.com/cuda/cusolver/index.html#cuds-lt-t-gt-getrf

For getri(): although the CUDA toolkit does not provide a getri() to solve AX=I, where A has been LU-factored by getrf(), it does provide getrs() to solve AX=B. All you need to do is set B=I before calling getrs(), and the solution X is the inverse.

http://docs.nvidia.com/cuda/cusolver/index.html#cuds-lt-t-gt-getrs
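
Below is a minimal sketch of how the two calls could be combined for double precision (cusolverDnDgetrf / cusolverDnDgetrs). It assumes the cuSOLVER handle is already created, A is already on the device in column-major layout, and error checking is omitted for brevity:

```c
#include <cuda_runtime.h>
#include <cusolverDn.h>
#include <stdlib.h>

// Invert an n x n double matrix stored column-major on the device.
// d_A is overwritten by its LU factors; the inverse is written to d_Ainv.
void invert_on_device(cusolverDnHandle_t handle, double *d_A, double *d_Ainv, int n)
{
    int lwork = 0;
    cusolverDnDgetrf_bufferSize(handle, n, n, d_A, n, &lwork);

    double *d_work; int *d_ipiv, *d_info;
    cudaMalloc(&d_work, lwork * sizeof(double));
    cudaMalloc(&d_ipiv, n * sizeof(int));
    cudaMalloc(&d_info, sizeof(int));

    // LU factorization with partial pivoting: P*A = L*U
    cusolverDnDgetrf(handle, n, n, d_A, n, d_work, d_ipiv, d_info);

    // Build B = I on the host and copy it to d_Ainv
    double *h_I = (double *)calloc((size_t)n * n, sizeof(double));
    for (int i = 0; i < n; ++i) h_I[i + (size_t)i * n] = 1.0;
    cudaMemcpy(d_Ainv, h_I, (size_t)n * n * sizeof(double), cudaMemcpyHostToDevice);
    free(h_I);

    // Solve A * X = I using the LU factors; X (the inverse) overwrites d_Ainv
    cusolverDnDgetrs(handle, CUBLAS_OP_N, n, n, d_A, n, d_ipiv, d_Ainv, n, d_info);

    cudaFree(d_work); cudaFree(d_ipiv); cudaFree(d_info);
}
```

Both calls operate on a single large matrix, so they use the full GPU rather than the many-small-problems path the batched APIs are tuned for.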
