How to get cufftComplex magnitude and phase fast

落花浮王杯 submitted on 2021-02-08 10:07:59

Question


I have a cufftComplex data block that is the result of a CUDA FFT (R2C). I know the data is stored as a structure with a real part followed by an imaginary part. Now I want to get the amplitude = sqrt(R*R + I*I) and the phase = arctan(I/R) of each complex element in a fast way (not with a for loop). Is there a good way to do that, or a library that could do it?


Answer 1:


Since cufftExecR2C operates on data that is on the GPU, the results are already on the GPU (before you copy them back to the host, if you are doing that).

It should be straightforward to write your own CUDA kernel to accomplish this. The amplitude you're describing is the value returned by cuCabs or cuCabsf in the cuComplex.h header file. By looking at the functions in that header file, you should be able to figure out how to write your own that computes the phase angle. You'll note that cufftComplex is just a typedef of cuComplex.

Let's say your cufftExecR2C call left some results of type cufftComplex in an array data of size sz. Your kernel might look like this:

#include <stdio.h>  // fprintf
#include <math.h>
#include <cuComplex.h>
#include <cufft.h>
#define nTPB 256    // threads per block for kernel
#define sz 100000   // or whatever your output data size is from the FFT
...

// atan2f keeps the computation in single precision and handles all four quadrants
__host__ __device__ float carg(const cuComplex& z) {return atan2f(cuCimagf(z), cuCrealf(z));} // polar angle

__global__ void magphase(cufftComplex *data, float *mag, float *phase, int dsz){
  int idx = threadIdx.x + blockDim.x*blockIdx.x;
  if (idx < dsz){
    mag[idx]   = cuCabsf(data[idx]);
    phase[idx] = carg(data[idx]);
  }
}

...
int main(){
...
    /* Use the CUFFT plan to transform the signal in place. */
    /* Your code might be something like this already:      */
    if (cufftExecR2C(plan, (cufftReal*)data, data) != CUFFT_SUCCESS){
      fprintf(stderr, "CUFFT error: ExecR2C Forward failed\n");
      return 1;   // main returns int, so report failure with a nonzero code
    }
    /* then you might add:                                  */
    float *h_mag, *h_phase, *d_mag, *d_phase;
    // malloc your h_ arrays using host malloc first, then...
    cudaMalloc((void **)&d_mag, sz*sizeof(float));
    cudaMalloc((void **)&d_phase, sz*sizeof(float));
    magphase<<<(sz+nTPB-1)/nTPB, nTPB>>>(data, d_mag, d_phase, sz);
    cudaMemcpy(h_mag, d_mag, sz*sizeof(float), cudaMemcpyDeviceToHost);
    cudaMemcpy(h_phase, d_phase, sz*sizeof(float), cudaMemcpyDeviceToHost);
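
The fragment above leaves out plan creation and allocation. As a rough end-to-end sketch, assuming a 1D out-of-place transform (the fragment above transforms in place) of n real samples: a 1D R2C transform yields n/2+1 non-redundant complex values, so that is the size used for the magnitude and phase arrays here. The helper name fft_magphase, the constant NTPB, and the cosine test signal are made up for illustration and are not part of cuFFT.

```cuda
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <cuComplex.h>
#include <cufft.h>

#define NTPB 256  // threads per block (arbitrary choice)

__global__ void magphase(const cufftComplex *data, float *mag, float *phase, int n) {
  int idx = threadIdx.x + blockDim.x * blockIdx.x;
  if (idx < n) {
    mag[idx]   = cuCabsf(data[idx]);
    phase[idx] = atan2f(cuCimagf(data[idx]), cuCrealf(data[idx]));
  }
}

// Hypothetical helper: FFT n real host samples, then write n/2+1 magnitudes
// and phases into the host arrays h_mag and h_phase. Returns 0 on success.
int fft_magphase(const float *h_in, int n, float *h_mag, float *h_phase) {
  int nout = n / 2 + 1;  // non-redundant complex outputs of a 1D R2C transform
  cufftReal *d_in = NULL;
  cufftComplex *d_out = NULL;
  float *d_mag = NULL, *d_phase = NULL;

  bool ok =
      cudaMalloc((void **)&d_in, n * sizeof(cufftReal)) == cudaSuccess &&
      cudaMalloc((void **)&d_out, nout * sizeof(cufftComplex)) == cudaSuccess &&
      cudaMalloc((void **)&d_mag, nout * sizeof(float)) == cudaSuccess &&
      cudaMalloc((void **)&d_phase, nout * sizeof(float)) == cudaSuccess &&
      cudaMemcpy(d_in, h_in, n * sizeof(cufftReal), cudaMemcpyHostToDevice) == cudaSuccess;

  cufftHandle plan = 0;
  bool have_plan = ok && cufftPlan1d(&plan, n, CUFFT_R2C, 1) == CUFFT_SUCCESS;
  ok = have_plan && cufftExecR2C(plan, d_in, d_out) == CUFFT_SUCCESS;
  if (ok) {
    magphase<<<(nout + NTPB - 1) / NTPB, NTPB>>>(d_out, d_mag, d_phase, nout);
    // A blocking cudaMemcpy on the default stream waits for the kernel to finish.
    ok = cudaMemcpy(h_mag, d_mag, nout * sizeof(float), cudaMemcpyDeviceToHost) == cudaSuccess &&
         cudaMemcpy(h_phase, d_phase, nout * sizeof(float), cudaMemcpyDeviceToHost) == cudaSuccess;
  }

  if (have_plan) cufftDestroy(plan);
  cudaFree(d_in);   // cudaFree(NULL) is a no-op, so this is safe after a failed allocation
  cudaFree(d_out);
  cudaFree(d_mag);
  cudaFree(d_phase);
  return ok ? 0 : 1;
}

int main() {
  const int n = 1024;  // assumed input length
  float *h_in    = (float *)malloc(n * sizeof(float));
  float *h_mag   = (float *)malloc((n / 2 + 1) * sizeof(float));
  float *h_phase = (float *)malloc((n / 2 + 1) * sizeof(float));

  // Test signal: a cosine with 8 whole cycles in the window, so bin 8 should hold the peak.
  for (int i = 0; i < n; ++i) h_in[i] = cosf(2.0f * 3.14159265f * 8.0f * i / n);

  int rc = fft_magphase(h_in, n, h_mag, h_phase);
  if (rc != 0) {
    fprintf(stderr, "FFT or CUDA call failed\n");
  } else {
    int peak = 0;
    for (int k = 1; k <= n / 2; ++k)
      if (h_mag[k] > h_mag[peak]) peak = k;
    printf("peak bin %d: magnitude %f, phase %f\n", peak, h_mag[peak], h_phase[peak]);
  }

  free(h_in);
  free(h_mag);
  free(h_phase);
  return rc;
}
```

This should build with something like nvcc example.cu -lcufft. Checking every return code this way is verbose; real code would more likely wrap the checks in a macro.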

You can also do this using thrust by creating functors for the magnitude and phase functions, and passing these functors along with data, mag and phase to thrust::transform.
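
A minimal sketch of that Thrust approach, assuming the complex values are held in a thrust::device_vector rather than in the raw cuFFT output buffer. The functor names Magnitude and Phase and the sample values are made up for illustration.

```cuda
#include <stdio.h>
#include <math.h>
#include <cuComplex.h>
#include <thrust/device_vector.h>
#include <thrust/host_vector.h>
#include <thrust/transform.h>

// Hypothetical functor names; they are not part of Thrust or cuFFT.
struct Magnitude {
  __host__ __device__ float operator()(const cuFloatComplex &z) const {
    return cuCabsf(z);  // magnitude of z
  }
};

struct Phase {
  __host__ __device__ float operator()(const cuFloatComplex &z) const {
    return atan2f(cuCimagf(z), cuCrealf(z));  // polar angle of z
  }
};

int main() {
  // Stand-in values for the FFT output (assumed, for illustration only).
  thrust::host_vector<cuFloatComplex> h_data(3);
  h_data[0] = make_cuFloatComplex(3.0f, 4.0f);
  h_data[1] = make_cuFloatComplex(0.0f, 2.0f);
  h_data[2] = make_cuFloatComplex(-1.0f, 0.0f);

  thrust::device_vector<cuFloatComplex> d_data = h_data;  // copy to the GPU
  thrust::device_vector<float> d_mag(d_data.size());
  thrust::device_vector<float> d_phase(d_data.size());

  // One element-wise pass per output array; both run on the device.
  thrust::transform(d_data.begin(), d_data.end(), d_mag.begin(), Magnitude());
  thrust::transform(d_data.begin(), d_data.end(), d_phase.begin(), Phase());

  thrust::host_vector<float> h_mag = d_mag;      // copy results back
  thrust::host_vector<float> h_phase = d_phase;
  for (size_t i = 0; i < h_mag.size(); ++i)
    printf("mag = %f, phase = %f\n", h_mag[i], h_phase[i]);
  return 0;
}
```

If the FFT output lives in a raw device pointer from cudaMalloc, thrust::device_pointer_cast can wrap it so that it can be passed to thrust::transform without an extra copy.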

It can probably be done with CUBLAS as well, using a combination of vector add and vector multiply operations.

This question/answer may be of interest as well. I lifted my phase function carg from there.



Source: https://stackoverflow.com/questions/18502087/how-to-get-cufftcomplex-magnitude-and-phase-fast
