opencl

Why is there a CL_DEVICE_MAX_WORK_GROUP_SIZE?

岁酱吖の submitted on 2019-12-08 20:31:26
Question: I'm trying to understand the architecture of OpenCL devices such as GPUs, and I fail to see why there is an explicit bound on the number of work items in a local work group, i.e. the constant CL_DEVICE_MAX_WORK_GROUP_SIZE. It seems to me that this should be taken care of by the compiler, i.e. if a (one-dimensional for simplicity) kernel is executed with local workgroup size 500 while its physical maximum is 100, and the kernel looks for example like this: __kernel void test(float* input) { i

Why must program (global) scope variables be __constant?

陌路散爱 submitted on 2019-12-08 19:31:56
Question: I am new to OpenCL and really confused by this restriction. For example, if I want to write an LCG, I have to make the state word modifiable by both rand() and srand(). In ANSI C, I would do that with something like: /* ANSI C */ static unsigned long _holdrand = 1; /* Global! */ unsigned long rand(){ _holdrand = _holdrand * 214013L + 2531011L; return (_holdrand >> 16) & 0x7FFF; } void srand( unsigned long seed ){ _holdrand = seed; } But OpenCL restricts all global scope variables being _
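One way to keep the generator from the question without a writable file-scope variable is to let the caller own the state and pass it by pointer. The sketch below is plain C rather than a kernel, and the names lcg_state_t, lcg_seed and lcg_next are hypothetical. It also assumes a 32-bit state word, whereas the question's unsigned long may be wider on some platforms.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names: lcg_state_t, lcg_seed and lcg_next are illustrative,
   not part of OpenCL or of the question's code. */
typedef struct {
    uint32_t hold; /* replaces the file-scope _holdrand */
} lcg_state_t;

/* Counterpart of srand(): the caller owns the state. */
void lcg_seed(lcg_state_t *st, uint32_t seed)
{
    assert(st != 0);
    st->hold = seed;
}

/* Counterpart of rand(): same constants as in the question. */
uint32_t lcg_next(lcg_state_t *st)
{
    assert(st != 0);
    st->hold = st->hold * 214013u + 2531011u; /* wraps modulo 2^32 */
    return (st->hold >> 16) & 0x7FFFu;
}
```

In a kernel, each work item would presumably keep its own state (for example in private memory or in one slot of a buffer), so that work items do not race on a shared word.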

Does OpenCL support function pointers?

一笑奈何 submitted on 2019-12-08 16:15:50
Question: Is there a way to use C style function pointers in OpenCL? In other words, I'd like to fill out an OpenCL struct with several values, as well as pointers to an OpenCL function. I'm not talking about going from a CPU function to a GPU function, I'm talking about going from a GPU function to a GPU function. Is this possible? --- EDIT --- If not, is there a way around this? In CUDA we have object inheritance, and in 4.0 we even have virtual functions. About the only way I can find to implement a
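As far as I know, OpenCL C does not allow pointers to functions, so a common workaround is to store an integer tag in the struct and dispatch on it with a switch. The sketch below is plain C and only illustrates that pattern. The names op_id_t, task_t and run_task are hypothetical, and this is not necessarily the approach the asker had in mind.

```c
#include <assert.h>

/* Hypothetical operation tags standing in for function pointers. */
typedef enum { OP_ADD = 0, OP_MUL = 1, OP_COUNT } op_id_t;

/* A struct that carries data plus an operation tag instead of a pointer. */
typedef struct {
    float a;
    float b;
    op_id_t op;
} task_t;

float op_add(float a, float b) { return a + b; }
float op_mul(float a, float b) { return a * b; }

/* Dispatch on the tag; every callee is known at compile time. */
float run_task(const task_t *t)
{
    assert(t != 0 && t->op < OP_COUNT);
    switch (t->op) {
    case OP_ADD: return op_add(t->a, t->b);
    case OP_MUL: return op_mul(t->a, t->b);
    default:     return 0.0f; /* not reached for valid tags */
    }
}
```

The cost is that adding a new operation means touching the enum and the switch, but every call target stays visible to the compiler.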

Base Address of Memory Object OpenCL

梦想的初衷 submitted on 2019-12-08 13:17:14
Question: I want to traverse a tree on the GPU with OpenCL, so I assemble the tree in a contiguous block on the host and change the addresses of all pointers so that they are consistent on the device, as follows: TreeAddressDevice = (size_t)BaseAddressDevice + ((size_t)TreeAddressHost - (size_t)BaseAddressHost); I need the base address of the memory buffer. On the host I allocate memory for the buffer as follows: cl_mem tree_d = clCreateBuffer(...); The problem is that cl_mems are objects that track an internal
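An alternative to patching pointers against a device base address is to store child links as indices into the node array, so the block means the same thing wherever it is copied. The sketch below is plain C under that assumption. node_t, NO_CHILD and tree_sum are hypothetical names, and the traversal uses an explicit stack because OpenCL C does not support recursion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NO_CHILD UINT32_MAX /* hypothetical sentinel for "no child" */

/* Hypothetical node layout: children are indices into the node array,
   so the block is valid at any base address. */
typedef struct {
    int32_t  value;
    uint32_t left;
    uint32_t right;
} node_t;

/* Iterative depth-first traversal that sums all values under root. */
int64_t tree_sum(const node_t *nodes, size_t count, uint32_t root)
{
    uint32_t stack[64]; /* assumes at most 64 pending nodes */
    size_t sp = 0;
    int64_t sum = 0;

    if (root == NO_CHILD) return 0;
    stack[sp++] = root;
    while (sp > 0) {
        uint32_t i = stack[--sp];
        assert(i < count);
        sum += nodes[i].value;
        if (nodes[i].left != NO_CHILD) {
            assert(sp < 64);
            stack[sp++] = nodes[i].left;
        }
        if (nodes[i].right != NO_CHILD) {
            assert(sp < 64);
            stack[sp++] = nodes[i].right;
        }
    }
    return sum;
}
```

With this layout the host can copy the array into a buffer unchanged, and the kernel only needs the buffer pointer it receives as an argument.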

What is the correct way to get OpenCL to play nice with OpenGL in Qt5?

喜欢而已 submitted on 2019-12-08 13:02:24
Question: I have found several unofficial sources for how to get OpenCL to play nice with OpenGL and Qt5, each with different levels of complexity: https://github.com/smistad/Qt-OpenGL-OpenCL-Interoperability https://github.com/petoknm/QtOpenCLGLInterop http://www.krazer.com/?p=109 Having these examples is nice, however they don't answer the following question: What exact steps are the minimum required to have a Qt5 widgets program display the result of a calculation made in an OpenCL kernel and then

OpenCL files fail to compile on OS X

南楼画角 submitted on 2019-12-08 12:50:45
Question: I have quite a large OpenCL file that compiles fine on both Windows and Ubuntu Linux but fails on Mac OS X. The cvmcompiler process uses 100% of the CPU and never completes. The full code of the project is here: https://github.com/favreau/Sol-R and the file in question is: https://github.com/favreau/Sol-R/blob/master/solr/engines/opencl/RayTracer.cl The problem should be fairly easy to reproduce by cloning the project and running the cmake/make process. Note that since OpenCL is compiled at

Addressing vector elements in C / OpenCL

走远了吗. submitted on 2019-12-08 12:37:05
Question: I'm writing an OpenCL kernel in PyOpenCL, where I want to address vector elements. In plain C, the result I want to have is: int i = 0; float *vec = (float*)malloc(sizeof(float)*4); for (i=0;i<4;i++) { vec[i]=2*i; } In OpenCL, the elements of a vector are accessed in a "pythonic" point-syntax style. float4 vec = (float4)(0); for (i=0;i<4;i++) { vec.si = 2*i; /*obviously doesn't work*/ } So vec[2] becomes vec.s2 in OpenCL, so it is no longer straightforward to access the element with a
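One way to get both named and indexed access is a union that overlays the named components with a four-element array. The sketch below is plain C11 and uses a hypothetical vec4_t type, not OpenCL's built-in float4. Whether the same overlay is appropriate for the built-in vector types in a kernel is an assumption that would need checking against the OpenCL C specification.

```c
#include <assert.h>

/* Hypothetical stand-in for a 4-component vector; not OpenCL's float4. */
typedef union {
    struct { float s0, s1, s2, s3; }; /* named access: v.s2 (C11 anonymous struct) */
    float s[4];                       /* indexed access: v.s[i] */
} vec4_t;

/* Fills component i with 2*i, as the loop in the question intends. */
vec4_t make_doubled(void)
{
    vec4_t v;
    int i;
    for (i = 0; i < 4; i++) {
        v.s[i] = 2.0f * (float)i;
    }
    return v;
}

/* Bounds-checked read of component i. */
float vec4_get(const vec4_t *v, int i)
{
    assert(v != 0 && i >= 0 && i < 4);
    return v->s[i];
}
```

If only indexed access is needed, a plain float array of length 4 is the simpler choice.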

Correctly using mutex in OpenCL-OpenCV-Realtime-Threads?

房东的猫 submitted on 2019-12-08 12:18:27
Question: I'm trying to get a stereo video stream in real time from USB webcams in a GPU thread (much faster than getting and processing the images on the CPU), process that stream in a second thread to detect faces, and control the threads via the keyboard in the main function (to be implemented later). At the moment the code runs properly (shows both Links/Rechts and draws a rectangle around my face) for ~30s and then crashes because of an "...unhandled exception (opencv_core249d.dll)". I've tried using mutex

How do you flatten image coordinates into a 1D array?

空扰寡人 submitted on 2019-12-08 12:04:10
Question: My source code is from Heterogeneous Computing with OpenCL, Chapter 4, Basic OpenCL Examples > Image Rotation. The book leaves out several critical details. My major problem is that I don't know how to initialize the array that I supply to their kernel (they don't tell you how). What I have is: int W = inImage.width(); int H = inImage.height(); float *myImage = new float[W*H]; for(int row = 0; row < H; row++) for(int col = 0; col < W; col++) myImage[row*W+col] = col; which I supply to this
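The indexing in the question's loop is the usual row-major mapping, index = row * W + col. The sketch below spells that mapping and its inverse out in plain C. flat_index and unflatten are hypothetical helper names, and the layout the book's kernel expects is assumed to be row-major as in the question's own loop.

```c
#include <assert.h>
#include <stddef.h>

/* Row-major index of pixel (row, col) in an image W wide and H high. */
size_t flat_index(size_t row, size_t col, size_t W, size_t H)
{
    assert(row < H && col < W);
    return row * W + col;
}

/* Inverse mapping: recovers (row, col) from a flat index. */
void unflatten(size_t idx, size_t W, size_t *row, size_t *col)
{
    assert(W > 0 && row != 0 && col != 0);
    *row = idx / W;
    *col = idx % W;
}
```

For a 5-wide image, pixel (row 2, col 3) lands at index 13, and dividing 13 by 5 gives back row 2 with remainder 3.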

OpenCL SDK overview and hardware interoperability

荒凉一梦 submitted on 2019-12-08 10:13:31
Question: I am a little bit confused about the overall situation when it comes to OpenCL development, so I'll just state my current understanding and questions as a list. Please correct me if I'm wrong. I know there are SDKs ("Platforms") by Intel, AMD (and I guess there is also OpenCL support in the Nvidia SDK?) Are there SDKs by other vendors? Will the SDK of one vendor support the devices of another? e.g. Nvidia devices with AMD SDK? I am able to run programs on my Intel CPU using AMD SDK. Is it the way