cuBLAS argmin — segfault if outputting to device memory?

闹比i 2020-12-11 19:05

In cuBLAS, cublasIsamin() gives the argmin (the 1-based index of the smallest-magnitude element) of a single-precision array.

Here's the full function declaration: cublasStatus_t cublasIsamin(cubla

1 answer
  • 2020-12-11 19:44

    The cuBLAS v2 API does support writing scalar results to device memory, but not by default. In the default host pointer mode the library treats the result pointer as a host address, which is why passing a device pointer segfaults. As per Section 2.4 "Scalar parameters" of the documentation, you need to call cublasSetPointerMode() with CUBLAS_POINTER_MODE_DEVICE to tell the API that scalar argument pointers reside in device memory. Note that this also makes these level 1 BLAS functions asynchronous, so you must ensure that the GPU has completed the kernel(s) before trying to access the result pointer.

    See this answer for a complete working example.
