CUDA, NPP Filters

Submitted by 爱⌒轻易说出口 on 2019-12-06 05:11:59

Question


The CUDA NPP library supports image filtering through nppiFilter_8u_C1R, but I keep getting errors when I call it. I have no problem getting the boxFilterNPP sample code up and running:

eStatusNPP = nppiFilterBox_8u_C1R(oDeviceSrc.data(), oDeviceSrc.pitch(), 
                                  oDeviceDst.data(), oDeviceDst.pitch(), 
                                  oSizeROI, oMaskSize, oAnchor);

But if I change it to use nppiFilter_8u_C1R instead, eStatusNPP returns error -24 (NPP_TEXTURE_BIND_ERROR). The code below shows the alterations I made to the original boxFilterNPP sample.

NppiSize oMaskSize = {5,5};
npp::ImageCPU_32s_C1 hostKernel(5,5);

for(int x = 0 ; x < 5; x++){
    for(int y = 0 ; y < 5; y++){
        hostKernel.pixels(x,y)[0].x = 1;
    }
}

npp::ImageNPP_32s_C1 pKernel(hostKernel);

Npp32s nDivisor = 1;

eStatusNPP = nppiFilter_8u_C1R(oDeviceSrc.data(), oDeviceSrc.pitch(), 
                               oDeviceDst.data(), oDeviceDst.pitch(), 
                               oSizeROI, 
                               pKernel.data(),
                               oMaskSize, oAnchor,
                               nDivisor);

This has been tried on CUDA 4.2 and 5.0, with the same result.

The code runs with the expected result when oMaskSize = {1,1}.


Answer 1:


I had the same problem when I stored my kernel as an ImageCPU/ImageNPP.

A good solution is to store the kernel as a traditional 1D array on the device. I tried this, and it gave me good results (and none of those unpredictable or garbage images).

Thanks to Frank Jargstorff in this StackOverflow post for the 1D idea.

NppiSize oMaskSize = {5,5};
Npp32s hostKernel[5*5];

for(int x = 0 ; x < 5; x++){
    for(int y = 0 ; y < 5; y++){
        hostKernel[x*5+y] = 1;
    }
}

Npp32s* pKernel; // just a regular, densely packed 1D array on the GPU (no row padding)
cudaMalloc((void**)&pKernel, 5 * 5 * sizeof(Npp32s));
cudaMemcpy(pKernel, hostKernel, 5 * 5 * sizeof(Npp32s), cudaMemcpyHostToDevice);

With the 1D kernel array, your code gives me a properly blurred version of the original image.

Other parameters that I used:

Npp32s nDivisor = 25;
NppiPoint oAnchor = {4, 4};
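One likely reason the 1D array helps, which is my assumption and is not stated in the answers here, is that an ImageNPP allocation pads each row out to a pitch, while the filter reads the kernel as a densely packed array. The CPU-only sketch below illustrates the effect with plain vectors. The helper names, the pitch of 8 elements, and the padding value are all hypothetical; real device padding is uninitialised rather than a fixed value.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Build a width x height block of ones stored with padded rows.
// pitchElems is the number of elements per padded row (illustrative only).
std::vector<int> makePitchedOnes(int width, int height, int pitchElems, int padValue) {
    assert(pitchElems >= width);
    std::vector<int> buf(static_cast<std::size_t>(pitchElems) * height, padValue);
    for (int row = 0; row < height; ++row)
        for (int col = 0; col < width; ++col)
            buf[static_cast<std::size_t>(row) * pitchElems + col] = 1;
    return buf;
}

// What a consumer expecting a dense array sees: the first width*height elements,
// which include padding from the early rows and miss the later rows.
std::vector<int> readAsDense(const std::vector<int>& pitched, int width, int height) {
    const std::size_t count = static_cast<std::size_t>(width) * height;
    assert(pitched.size() >= count);
    return std::vector<int>(pitched.begin(), pitched.begin() + count);
}

// Repack row by row, skipping the padding at the end of each row.
std::vector<int> packRows(const std::vector<int>& pitched, int width, int height, int pitchElems) {
    std::vector<int> dense;
    dense.reserve(static_cast<std::size_t>(width) * height);
    for (int row = 0; row < height; ++row) {
        const auto rowStart = pitched.begin() + static_cast<std::ptrdiff_t>(row) * pitchElems;
        dense.insert(dense.end(), rowStart, rowStart + width);
    }
    return dense;
}

// Sum of all coefficients, i.e. the divisor a box-style kernel would need.
int sumOf(const std::vector<int>& v) {
    return std::accumulate(v.begin(), v.end(), 0);
}
```

With a 5x5 block of ones, a pitch of 8 and a padding value of -1, the naive dense read sums to 7 instead of 25, whereas the row-by-row repack sums to 25. A 1x1 kernel never reaches the padding, which would be consistent with the {1,1} mask working in the question.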



Answer 2:


nppiFilter applies the mask extending upward and to the left, following the mathematical convention that the convolution of two functions reverses the direction of the second function.

The box filter mask extends downwards and to the right, which is probably more intuitive.

In any case, the problem is caused by the fact that the input image in the changed code would have to be sampled at what would effectively be SOURCE[-4, -4] in order to compute DESTINATION[0, 0]. Since the input image is being accessed via a texture sampler, binding the source image pointer offset by (-4, -4) causes the texture-bind error you're seeing.

Workaround: The simplest workaround for this issue would be to set the anchor point to (4, 4), which would effectively move the mask down and to the right. You still need to be aware that you'd want to invert the weights in the kernel array (i.e. K[-4, -4] -> K[0, 0], K[0, 0] -> K[-4, -4], etc.).
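To make the description above concrete, here is a small CPU-only sketch. It assumes that the tap at kernel index i reads the source at offset (anchor - i) along each axis, which is my reading of this answer and not a formula taken from the NPP documentation. Both helper names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Inclusive range of source offsets touched along one axis.
struct OffsetRange {
    int lo;
    int hi;
};

// Assumed convention: kernel index i (0 .. maskSize-1) reads source offset (anchor - i).
OffsetRange sourceOffsets(int maskSize, int anchor) {
    assert(maskSize > 0);
    return OffsetRange{anchor - (maskSize - 1), anchor};
}

// Rotate a dense row-major kernel by 180 degrees (reverse both axes).
// Reversing the flat array does this because element (i, j) at index j*w + i
// moves to index (h-1-j)*w + (w-1-i).
std::vector<int> flipKernel(std::vector<int> kernel) {
    std::reverse(kernel.begin(), kernel.end());
    return kernel;
}
```

Under that assumption, a 5-wide mask with anchor 0 touches offsets -4 to 0 (outside the image at the top-left corner), while anchor 4 touches offsets 0 to 4. Flipping the kernel before upload would then restore the orientation you had in mind for an asymmetric kernel.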




Answer 3:


Thank you for the help. I got past the error, but I'm seeing some odd behavior: the output image changes depending on which program I ran just before, and it does not show what I am going for.

What I am trying to do is mimic nppiFilterBox_8u_C1R using nppiFilter_8u_C1R, with the kernel set to all ones and nDivisor set to the sum of the kernel.

This code is still an alteration of the boxFilterNPP sample code.

NppiSize oMaskSize = {5,5};
npp::ImageCPU_32s_C1 hostKernel(5,5);
for(int x = 0 ; x < 5; x++){
    for(int y = 0 ; y < 5; y++){
        hostKernel.pixels(x,y)[0].x = 1;
    }
}

npp::ImageNPP_32s_C1 pKernel(hostKernel);
Npp32s nDivisor = 25;
NppiPoint oAnchor = {4, 4};
eStatusNPP = nppiFilter_8u_C1R(oDeviceSrc.data(),oDeviceSrc.pitch(), 
                               oDeviceDst.data(), oDeviceDst.pitch(), 
                               oSizeROI, 
                               pKernel.data(),
                               oMaskSize, oAnchor,
                               nDivisor);

Since the kernel is all ones, the need to invert the weights should not be an issue.
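As a sanity check on what the output should look like, here is a CPU reference sketch. It assumes the convolution form described in the second answer, where tap (i, j) reads source pixel (x + anchorX - i, y + anchorY - j), and it uses plain integer division, so NPP's rounding may differ. The function names are hypothetical and nothing here calls NPP.

```cpp
#include <cassert>
#include <vector>

// General filter under the assumed convention; kernel is dense and row-major.
int filterPixel(const std::vector<unsigned char>& src, int srcW, int srcH,
                int x, int y,
                const std::vector<int>& kernel, int maskW, int maskH,
                int anchorX, int anchorY, int divisor) {
    assert(divisor != 0);
    int sum = 0;
    for (int j = 0; j < maskH; ++j) {
        for (int i = 0; i < maskW; ++i) {
            const int sx = x + anchorX - i;
            const int sy = y + anchorY - j;
            // Reads outside the image are exactly what must be avoided.
            assert(sx >= 0 && sx < srcW && sy >= 0 && sy < srcH);
            sum += kernel[j * maskW + i] * src[sy * srcW + sx];
        }
    }
    return sum / divisor;  // truncating division; NPP may round differently
}

// Box mean over the window that starts at (x, y) and extends down and right.
int boxPixel(const std::vector<unsigned char>& src, int srcW, int srcH,
             int x, int y, int maskW, int maskH) {
    int sum = 0;
    for (int j = 0; j < maskH; ++j) {
        for (int i = 0; i < maskW; ++i) {
            assert(x >= 0 && y >= 0 && x + i < srcW && y + j < srcH);
            sum += src[(y + j) * srcW + (x + i)];
        }
    }
    return sum / (maskW * maskH);
}
```

With a 5x5 kernel of ones, anchor (4, 4) and divisor 25, filterPixel visits the same 25 source pixels as boxPixel, so the two agree. If the GPU output differs from such a reference, the kernel data reaching the filter is a reasonable first suspect.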

The five different images this code returns are shown below. The last one is returned most often.

http://1ordrup.dk/kasper/image/Lena_boxFilter1.jpg
http://1ordrup.dk/kasper/image/Lena_boxFilter2.jpg
http://1ordrup.dk/kasper/image/Lena_boxFilter3.jpg
http://1ordrup.dk/kasper/image/Lena_boxFilter4.jpg 
http://1ordrup.dk/kasper/image/Lena_boxFilter5.jpg

I think this happens because the kernel is not initialised correctly or not used, so data with pseudo-random content is used for the kernel instead.



Source: https://stackoverflow.com/questions/12778463/cuda-npp-filters
