infinity as result in double operation

醉酒当歌 提交于 2019-12-24 15:29:43

问题


I would understand why the result is infinity. I write the code below and I always receive inf as result. There is any precision problem with my code?

#include <stdio.h>
#include <stdlib.h>

#include "cuda.h"
#include "curand_kernel.h"

#define NDIM 30
#define NPAR 5

#define DIMPAR NDIM*NPAR

__device__ double uniform(int index){
    return (double) 0.767341;
}


__global__ void iteracao(double *pos){

    int thread = threadIdx.x + blockDim.x * blockIdx.x;
    double tvel;
    int i = 0;

    double l, r, t;

    if(thread < DIMPAR){
        do{
            t = (double) uniform(thread);
            l = (double) 2.05 * t * ( pos[thread] );
            r = (double) 2.05 * t * ( pos[thread] );
            tvel = (double) l+t+r;
            pos[thread] =  tvel;
            i++;
        }while(i < 10000);
    }

}


int main(int argc, char *argv[])
{

    double *d_pos,    *h_pos;


    h_pos = (double *) malloc(sizeof( double ) * DIMPAR);


    cudaMalloc((void**)&d_pos, DIMPAR   * sizeof( double ));


    int i, j, k, numthreadsperblock, numblocks;

    numthreadsperblock = 512;
    numblocks = (DIMPAR / numthreadsperblock) + ((DIMPAR % numthreadsperblock)?1:0);
    //
    printf("numthreadsperblock: %i;; numblocks:%i\n", numthreadsperblock, numblocks);

    cudaMemset(d_pos,  0.767341, DIMPAR   * sizeof( double ));
    iteracao<<<numblocks,numthreadsperblock>>>(d_pos);
    cudaMemcpy(h_pos, d_pos, DIMPAR * sizeof( double ), cudaMemcpyDeviceToHost);

    printf("\n");
    for(i = 0; i < NPAR; i++){
        for(j = i*NDIM, k = j; j < (k+30); j++){
            printf("%f,", h_pos[j]);
        }
        printf("***\n\n");
    }

    system("PAUSE");
    return 0;
}

the output is always this:

numthreadsperblock: 512;; numblocks:1

inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,*

inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,*

inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,*

inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,*

inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,inf,*


回答1:


You have 2 problems. The first is as described by @Anycorn in the comments. cudaMemset, just like memset expects a byte value and sets byte locations. You cannot use it to initialize float values.

The second is that your kernel has a loop that is operating 10000 times on each pos array element. In effect you are finding the 10000 factorial of a complicated expression. Since that expression is always positive, your answer blows up. In all probability your kernel is not written correctly. It is not doing what you want it to do. Even if you fix your first problem and properly initialize pos to zero, your calculation will still blow up.

The arithmetic you are performing is:

pos[idx] =  0.767341 + (3.1460981 * pos[idx]);

For each idx, you are performing the above operation 10000 times. Even for an initial pos[idx] value equal to zero, by the 2nd iteration of your loop, it will start to take off geometrically.




回答2:


You are init d_pos in a wrong way. cudaMemset() can only set memory byte by byte. see cudaMemset() doc for more details.

To init the array as you intended to, you could use Thrust as a express way.

thrust::fill(
    thrust::device_pointer_cast(d_pos),
    thrust::device_pointer_cast(d_pos) + DIMPAR,
    0.767341);


来源:https://stackoverflow.com/questions/18829412/infinity-as-result-in-double-operation

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!