opencl | 易学教程

How to process string in opencl kernel from buffer of N fixed length strings?

Question: I need to process N fixed-length strings in parallel on an OpenCL device. Processing a string means calling a provided function that takes the string, represented as a buffer, and the length of the string in that buffer: void Function(const char *input_buffer, const int string_length, const char *output_buffer) In the host application I have concatenated the N strings into one large char buffer, with no separator between them. I would like to create a kernel with a
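A possible way to picture the indexing in this question: with a fixed length and no separators, string i starts at byte offset i * string_length, so each work-item can find its own slice from its global ID. The sketch below emulates that on the host in plain C. The names process_string, work_item and process_all are hypothetical, and process_string is only a stand-in for the provided Function (it takes a writable output pointer so the result can be stored). In a real kernel the loop would disappear and gid would come from get_global_id(0).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical stand-in for the provided Function: upper-cases one string.
   The output pointer is non-const here so the result can be written. */
static void process_string(const char *input, int string_length, char *output)
{
    for (int i = 0; i < string_length; ++i)
        output[i] = (char)toupper((unsigned char)input[i]);
}

/* What one work-item would do. String number gid begins at byte offset
   gid * string_length because the strings are packed back to back. */
static void work_item(size_t gid, const char *in, int string_length, char *out)
{
    size_t offset = gid * (size_t)string_length;
    process_string(in + offset, string_length, out + offset);
}

/* Host-side emulation of launching n work-items, one per string. */
void process_all(const char *in, size_t n, int string_length, char *out)
{
    assert(string_length > 0);
    for (size_t gid = 0; gid < n; ++gid)
        work_item(gid, in, string_length, out);
}
```

Called as process_all(buffer, N, string_length, output), this touches every string exactly once. The output buffer is assumed to be at least N * string_length bytes.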

OpenCL clCreateContextFromType function results in memory leaks

Question: I ran valgrind on one of my open-source OpenCL codes (https://github.com/fangq/mmc), and it detected a lot of memory leaks in the OpenCL host code. Most of them pointed back to the line where I created the context object using clCreateContextFromType. I double-checked all my OpenCL variables, command queues, kernels and programs, and made sure that they are all properly released, but still, when testing on sample programs, every call to the mmc_run_cl() function bumps up memory by 300MB

GPU with OpenCL is slower than CPU. Why?

Question: Environment: Intel i7-9750H, Intel UHD Graphics 630, Nvidia GTX1050 (Laptop), Visual Studio 2019 / C++, OpenCV 4.4, OpenCL 3.0 (intel) / 1.2 (nvidia). I'm trying to use OpenCL to speed up my code, but the results show that the CPU is faster than the GPU. How can I speed up my code? void GetHoughLines(cv::Mat dst) { cv::ocl::setUseOpenCL(true); int img_w = dst.size().width; // 5000 int img_h = dst.size().height; // 4000 cv::UMat tmp_dst = dst.getUMat(cv::ACCESS_READ); cv::UMat tmp_mat = cv::UMat(dst.size(), CV

Process strings from OpenCL kernel

Question: There are several strings like std::string first, second, third; ... My plan was to collect their addresses into a char* array: char *addresses = {&first[0], &second[0], &third[0]} ... and pass the char **addresses to the OpenCL kernel. There are several problems or questions: The main issue is that I cannot pass an array of pointers. Is there any good way to use a great many strings from the kernel code without copying them, leaving them in shared memory? I'm using NVIDIA on Windows. So, I

OpenCL Kernel code compile error - Visual Studio 2019

Question: I'm fairly new to OpenCL programming and am trying to run a simple vector addition code in VS 2019. However, I can't get the .cl code to compile. It shows these 6 errors when I try to build the program: Error C2144 syntax error: 'void' should be preceded by ';' Error C4430 missing type specifier - int assumed. Note: C++ does not support default-int Error C2065 '__global': undeclared identifier Error C2146 syntax error: missing ')' before identifier 'float4' Error C2143 syntax error:

PyopenCL 3D RGBA image from numpy array

Question: I want to construct an OpenCL 3D RGBA image from a numpy array, using pyopencl. I know about the cl.image_from_array() function, which does basically that but gives no control over command queues or events, which cl.enqueue_copy() exposes. So I would really like to use the latter function to transfer a 3D RGBA image from host to device, but I can't seem to get the syntax of the image constructor right. So in this environment import pyopencl as cl import

OpenCL Kernel only partly writing to output buffer

Question: I am reading large integer values from an array that has over a million elements. The values come from a wav file read with the libsndfile library. If I do not use the kernel, I can write the original array to my output file and listen to the audio with no issues. However, when I use the kernel to do the exact same thing, it writes less than a second of the song. At first, I thought this was a memory issue, so I played around with the buffer sizes and still no

what is the content of cl_platform_id data structure?

Question: I understand that cl_platform_id is a data structure like: typedef struct{ foo1 bar1; foo2 bar2; ...; }cl_platform_id; But what is the content of this structure? For example, if I want to print its content to the console, what data type should I use? I tried integer but I got the error: warning: format specifies type 'int' but the argument has type 'cl_platform_id' (aka 'struct _cl_platform_id *') [-Wformat] Thanks for your help in advance. Answer 1: The cl_platform_id is an abstract (opaque)
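The compiler warning quoted in this question already shows that cl_platform_id is a pointer to a struct. An opaque handle of that kind has no fields the caller can read, so about the only thing that can be printed directly is its address, using %p. The sketch below illustrates this with a hypothetical stand-in type (platform_handle, backed by a struct that is never defined) so that it compiles without the OpenCL headers. format_handle is likewise a made-up helper name.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for an opaque handle type: a pointer to a struct
   whose definition is never visible to the caller. */
typedef struct opaque_platform *platform_handle;

/* Formats the handle's address into buf. %p expects a void pointer,
   hence the cast. Returns the value reported by snprintf. */
int format_handle(platform_handle id, char *buf, size_t size)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "%p", (void *)id);
}
```

To get human-readable details such as the platform name or vendor from a real cl_platform_id, the handle is passed to clGetPlatformInfo rather than dereferenced.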

How to reduce OpenCL enqueue time/any other ideas?

Question: I have an algorithm that I've been trying to accelerate using OpenCL on my nVidia. It has to process a large amount of data (say 100k to millions of items), where for each datum: a matrix (on the device) has to be updated first (using the datum and two vectors); and only after the whole matrix has been updated, the two vectors (also on the device) are updated using the same datum. So, my host code looks something like this for (int i = 0; i < milions; i++) { clSetKernelArg(kernel

Passing a function as an argument in OpenCL

Question: Is it possible to pass a function pointer to a kernel in OpenCL 1.2? I know it can be done in C, but I don't know how to do it in OpenCL's C. Edit: I would like to do the same thing that is described in this post: How do you pass a function as a parameter in C?, but to a kernel. Previously, I have used inline functions and called them from a kernel, but I want the function to be a parameter instead of hard-coded. Answer 1: Short: OpenCL's C != C, consider it as a syntactical help that most of it
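OpenCL C does not allow pointers to functions, so one common workaround is to pass an integer selector as the kernel argument and dispatch on it with a switch. This is not necessarily what the truncated answer goes on to suggest. The sketch below shows the pattern in plain C; the names op_id, op_square, op_negate and apply_op are all hypothetical.

```c
#include <assert.h>

/* Hypothetical selector values that the host would pass as a kernel argument. */
enum op_id { OP_SQUARE = 0, OP_NEGATE = 1 };

static float op_square(float x) { return x * x; }
static float op_negate(float x) { return -x; }

/* Dispatches on an integer instead of calling through a function pointer. */
float apply_op(int op, float x)
{
    switch (op) {
    case OP_SQUARE: return op_square(x);
    case OP_NEGATE: return op_negate(x);
    default:
        /* Host-side check only; a kernel would need its own fallback. */
        assert(!"unknown op");
        return x;
    }
}
```

The set of callable functions is fixed when the kernel is compiled; only the choice among them is made at run time.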