clEnqueueNDRangeKernel blocks execution

☆樱花仙子☆ 提交于 2019-12-13 05:17:29

问题


Another question for me now. I've been trying to analyze the results of my kernel parallel to its execution while it's broken up to multiple calls. However, while clEnqueueReadBuffer has a boolean to determine whether it blocks or not, clEnqueueNDRangeKernel has none and I had assumed it was async always (It is being "enqueued" afterall which makes me assume that it would act like a task queue). However, when I run this block of code the outer code doesn't get executed until the kernel has been finished completely (I am not explicitly calling clFinish or anything like that would cause this behavior).

I'm running the kernel on an NVidia GPU. So why is this segment of code blocking and what could I do to remedy it within OpenCL? Otherwise, I'm considering running a separate thread solely to "enqueue" these kernel commands to the queue.

const size_t amountPerGo = multipleRoundUp(local_ws, (int)(50000));
//Finds the smallest multiple of local worksize that greater than the 50000 segment

std::cout << "Launch" << std::endl;

for( int j = 0; j < 10; j++ ) //Make the effects more extreme
{
    for( size_t i = 0; i < dimensions.x*dimensions.y; i+= amountPerGo )
    {
        clSetKernelArg(rayKernel, 6, sizeof(int), &i);
        std::cout << "sub" << std::endl;

        error = clEnqueueNDRangeKernel(queue, kernel, 1, NULL, &amountPerGo, &local_ws, 0, NULL, NULL);

        // Reading back
        clEnqueueReadBuffer(queue, outResult, CL_FALSE, sizeof(vec4)*i, sizeof(vec4)*(amountPerGo), resultSet+i, 0, NULL, NULL);
    }
}

std::cout << "End launch Start" << std::endl;

回答1:


There is possiblity of simultaneous execution of OpenCL kernel & arguments setting. Try to use different kernel objects.



来源:https://stackoverflow.com/questions/25437554/clenqueuendrangekernel-blocks-execution

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!