Can anyone describe the differences between __global__ and __device__?
When should I use __device__, and when should I use __global__?
The differences between __device__ and __global__ functions are:
__device__ functions can be called only from the device, and they are executed only on the device.
__global__ functions can be called from the host, and they are executed on the device.
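To make the two points above concrete, here is a minimal sketch. The names square, squareAll and squareOnGpu are made up for illustration, and the block size of 256 is just a common choice, not a requirement.

```cuda
#include <cuda_runtime.h>

// __device__: runs on the GPU and can only be called from GPU code.
__device__ float square(float x) {
    return x * x;
}

// __global__: a kernel. It is launched from the host, runs on the GPU,
// and must return void.
__global__ void squareAll(const float* in, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = square(in[i]);  // ordinary call, no <<<...>>> needed
    }
}

// Hypothetical host helper: copies the input to the GPU, launches the
// kernel, and copies the result back. Returns true if every step succeeded.
bool squareOnGpu(const float* hostIn, float* hostOut, int n) {
    float* dIn = nullptr;
    float* dOut = nullptr;
    size_t bytes = static_cast<size_t>(n) * sizeof(float);

    if (cudaMalloc((void**)&dIn, bytes) != cudaSuccess) return false;
    if (cudaMalloc((void**)&dOut, bytes) != cudaSuccess) {
        cudaFree(dIn);
        return false;
    }

    bool ok = cudaMemcpy(dIn, hostIn, bytes, cudaMemcpyHostToDevice) == cudaSuccess;

    if (ok) {
        int threads = 256;                         // threads per block (assumed value)
        int blocks = (n + threads - 1) / threads;  // enough blocks to cover n elements
        squareAll<<<blocks, threads>>>(dIn, dOut, n);  // kernel launch from the host
        ok = cudaGetLastError() == cudaSuccess;
    }
    if (ok) {
        // This copy also waits for the kernel to finish.
        ok = cudaMemcpy(hostOut, dOut, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }

    cudaFree(dIn);
    cudaFree(dOut);
    return ok;
}
```

Host code only ever touches squareAll (through the launch) and never calls square directly. Trying to call square from a host function should be rejected by the compiler.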
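Therefore, you call __device__ functions from kernel functions with an ordinary function call, and you don't have to specify a launch configuration (the <<<blocks, threads>>> part) as you do for a kernel. You can also make the same function available on both sides. Note that nvcc does not accept a plain void foo(void) and a __device__ void foo(void) with the same signature as two separate overloads; it reports that foo has already been defined. Instead, declare the function once as __host__ __device__ void foo(void). The compiler then builds two versions: a host version, which can only be called from host functions, and a device version, which can only be called from device or kernel functions.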
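As a rough sketch of having one function usable from both host and device code, the example below uses the __host__ __device__ combination that CUDA provides for this. The names clamp01, clampAll and clampOnHost are hypothetical.

```cuda
#include <cuda_runtime.h>

// Compiled twice: once for the CPU and once for the GPU.
__host__ __device__ inline float clamp01(float x) {
    return x < 0.0f ? 0.0f : (x > 1.0f ? 1.0f : x);
}

// Kernel: each thread clamps one element using the device version.
__global__ void clampAll(float* data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] = clamp01(data[i]);
    }
}

// Plain host function: uses the host version of the same code.
float clampOnHost(float x) {
    return clamp01(x);
}
```

The kernel would still be launched from the host in the usual way, e.g. clampAll<<<blocks, threads>>>(devicePtr, n), while clampOnHost can be called like any normal C++ function.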
You can also visit the following link: http://code.google.com/p/stanford-cs193g-sp2010/wiki/TutorialDeviceFunctions. It was useful for me.