PyCUDA: Pow within device code tries to use std::pow, fails

Submitted by ℡╲_俬逩灬. on 2020-11-29 05:50:10

Question


Question more or less says it all.

calling a host function("std::pow<int, int> ") from a __device__/__global__ function("_calc_psd") is not allowed

From my understanding, this should be using the CUDA pow function instead, but it isn't.


Answer 1:


The error is exactly what the compiler reports. You can't use host functions in device code, and that includes the whole host C++ std library. CUDA includes its own standard math library, described in the programming guide, so you should use either pow or powf (taken from the C standard library, with no C++ or namespaces). nvcc will resolve the call to the correct CUDA device function and inline the resulting code. Something like the following will work:

#include <math.h>

__device__ float func(float x) {
   // powf is the single-precision C math function; nvcc maps it to the device version
   return x * x * powf(x, 0.123456f);
}

EDIT: What I missed the first time is the template specifier reported in the error. Are you sure that you are passing either float or double arguments to pow? The std::pow<int, int> in the message suggests that both arguments are integers. If you are passing integers, there is no overloaded function for them in the CUDA standard library, which is why it might be failing. If you need an integer pow function, you will have to roll your own (or cast the arguments, but pow is a rather expensive function and I am certain some cascaded integer multiplication will be faster).
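To make the two options from the edit concrete, here is a minimal sketch of a hand-rolled integer power and of the cast-based alternative. The names ipow, pow_via_cast and the HD macro are hypothetical and do not come from the question or from CUDA itself. The sketch assumes a non-negative exponent and does no overflow checking. The macro is only there so the same helpers compile both under nvcc and under a plain C++ compiler.

```cpp
#include <math.h>

// Expands to __host__ __device__ under nvcc and to nothing under a plain
// C++ compiler, so the helpers below can also be checked on the host.
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

// Hypothetical helper: integer power by repeated squaring.
// Assumes exp >= 0 and that the result fits in an int (no overflow check).
HD inline int ipow(int base, int exp) {
    int result = 1;
    while (exp > 0) {
        if (exp & 1) {
            result *= base;      // this bit of the exponent is set
        }
        exp >>= 1;
        if (exp > 0) {
            base *= base;        // square only if another bit remains
        }
    }
    return result;
}

// Hypothetical alternative: cast both arguments to float so the call
// resolves to powf instead of a host-only std::pow<int, int>.
HD inline float pow_via_cast(int base, int exp) {
    return powf(static_cast<float>(base), static_cast<float>(exp));
}
```

Inside a kernel such as _calc_psd, a call like pow(i, 2) with integer arguments could then become ipow(i, 2) or simply i * i. For small fixed exponents the plain multiplication is likely the cheapest choice.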



Source: https://stackoverflow.com/questions/5656605/pycuda-pow-within-device-code-tries-to-use-stdpow-fails
