Using constant memory prints address instead of value in CUDA

谁都会走 submitted on 2019-12-03 00:49:47

Question


I am trying to use constant memory, with its values assigned from inside the kernel rather than through cudaMemcpyToSymbol.

#include <iostream>
using namespace std;
#define N 10
//__constant__ int constBuf_d[N];
__constant__ int *constBuf;

__global__ void foo( int *results )
{
    int tdx = threadIdx.x;
    int idx = blockIdx.x * blockDim.x + tdx;

    if( idx < N )
    {
        constBuf[idx]=1;
        results[idx] = constBuf[idx];
    }
}

// main routine that executes on the host
int main(int argc, char* argv[])
{
    int *results_h = new int[N];
    int *results_d;

    cudaMalloc((void **)&results_d, N*sizeof(int));

    foo <<< 1, 10 >>> ( results_d );

    cudaMemcpy(results_h, results_d, N*sizeof(int), cudaMemcpyDeviceToHost);

    for( int i=0; i < N; ++i )
        printf("%i ", results_h[i] );
    delete(results_h);
}

The output is:

6231808 6226116 0 0 0 0 0 0 0 0 

I want the program to print the value that the kernel assigns to constant memory.


Answer 1:


Constant memory is, as the name implies, constant/read-only with respect to device code. What you are trying to do is illegal and can't be made to work.

To set values in constant memory, you currently have two choices:

  1. set the value from host code via the cudaMemcpyToSymbol API call (or its equivalents)
  2. use static initialisation at compile time

In the latter case something like this would work:

__constant__ int constBuf[N] = { 16, 2, 77, 40, 12, 3, 5, 3, 6, 6 };

__global__ void foo( int *results )
{
    int tdx = threadIdx.x;
    int idx = blockIdx.x * blockDim.x + tdx;


    if( tdx < N )
    {
        results[idx] = constBuf[tdx]; // Note changes here!
    }
}
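
For the first option, setting the values from host code, a minimal sketch might look like the following. It assumes constBuf is declared as a fixed-size array rather than a pointer, and the helper name copyThroughConstant is hypothetical, introduced here only for illustration. The kernel only reads from constant memory; all writes happen on the host through cudaMemcpyToSymbol.

```cuda
#include <cuda_runtime.h>

#define N 10

// Fixed-size array in constant memory; device code may only read it.
__constant__ int constBuf[N];

__global__ void foo( int *results )
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;

    if( idx < N )
    {
        results[idx] = constBuf[idx]; // read-only access from the kernel
    }
}

// Hypothetical helper (not from the original post): copies N host values
// into constant memory, runs the kernel, and copies the results back.
cudaError_t copyThroughConstant( const int *values_h, int *results_h )
{
    int *results_d = nullptr;
    cudaError_t err = cudaMalloc((void **)&results_d, N*sizeof(int));
    if( err != cudaSuccess )
        return err;

    // Host-side write into the __constant__ symbol.
    err = cudaMemcpyToSymbol(constBuf, values_h, N*sizeof(int));
    if( err == cudaSuccess )
    {
        foo <<< 1, N >>> ( results_d );
        err = cudaGetLastError(); // reports launch failures
    }
    if( err == cudaSuccess )
    {
        // Blocking copy; also surfaces errors raised while the kernel ran.
        err = cudaMemcpy(results_h, results_d, N*sizeof(int), cudaMemcpyDeviceToHost);
    }

    cudaFree(results_d);
    return err;
}
```

Calling copyThroughConstant from main with an array of ten integers and printing results_h afterwards should show the same ten values, provided the returned status is cudaSuccess.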


Source: https://stackoverflow.com/questions/24029679/using-constant-memory-prints-address-instead-of-value-in-cuda
