opencl

How to process string in opencl kernel from buffer of N fixed length strings?

人盡茶涼 submitted on 2021-01-29 14:33:17
Question: I am required to process N fixed-length strings in parallel on an OpenCL device. Processing a string involves calling a provided function that takes as input a buffer containing the string and the length of the string in that buffer: void Function(const char *input_buffer, const int string_length, const char *output_buffer). Inside the host application I have concatenated the N strings into one large char buffer, with no separator between them. I would like to create a kernel with a
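The layout described in the question above (N strings of equal length packed back to back) means each work-item can find its own string by multiplying its global ID by the string length. The following is a minimal host-side sketch of that indexing in plain C, not an actual kernel: `gid` stands in for `get_global_id(0)`, and `process_string`, `work_item`, and `run_all` are hypothetical names. The upper-casing body is only a placeholder for the provided function, and the output pointer is assumed to be writable (non-const).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical stand-in for the provided per-string function:
   copies string_length characters to output, upper-casing them. */
static void process_string(const char *input, int string_length, char *output)
{
    for (int i = 0; i < string_length; ++i)
        output[i] = (char)toupper((unsigned char)input[i]);
}

/* What a single work-item would do. gid plays the role of get_global_id(0). */
static void work_item(size_t gid, const char *input_buffer, int string_length,
                      char *output_buffer)
{
    assert(string_length > 0);
    /* String number gid starts at byte gid * string_length. */
    size_t offset = gid * (size_t)string_length;
    process_string(input_buffer + offset, string_length, output_buffer + offset);
}

/* Emulates launching n work-items; here they simply run one after another. */
static void run_all(size_t n, const char *input_buffer, int string_length,
                    char *output_buffer)
{
    for (size_t gid = 0; gid < n; ++gid)
        work_item(gid, input_buffer, string_length, output_buffer);
}
```

For example, calling `run_all(3, "abcdefghijkl", 4, out)` treats the input as the three strings "abcd", "efgh", and "ijkl" and writes each result at the same offset in `out`. In a real kernel the loop in `run_all` would disappear and the global work size would be N.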

OpenCL clCreateContextFromType function results in memory leaks

旧街凉风 submitted on 2021-01-29 12:16:56
Question: I ran valgrind on one of my open-source OpenCL codes (https://github.com/fangq/mmc), and it detected a lot of memory leaks in the OpenCL host code. Most of those pointed back to the line where I created the context object using clCreateContextFromType. I double-checked all my OpenCL variables, command queues, kernels and programs, and made sure that they are all properly released, but still, when testing on sample programs, every call to the mmc_run_cl() function bumps up memory by 300MB

GPU with OpenCL is slower than CPU. Why?

こ雲淡風輕ζ submitted on 2021-01-29 09:08:18
Question: Environment: Intel i7-9750H; Intel UHD Graphics 630; Nvidia GTX1050 (Laptop); Visual Studio 2019 / C++; OpenCV 4.4; OpenCL 3.0 (Intel) / 1.2 (Nvidia). I'm trying to use OpenCL to speed up my code, but the results show that the CPU is faster than the GPU. How could I speed up my code? void GetHoughLines(cv::Mat dst) { cv::ocl::setUseOpenCL(true); int img_w = dst.size().width; // 5000 int img_h = dst.size().height; // 4000 cv::UMat tmp_dst = dst.getUMat(cv::ACCESS_READ); cv::UMat tmp_mat = cv::UMat(dst.size(), CV

Process strings from OpenCL kernel

喜欢而已 submitted on 2021-01-29 07:22:46
Question: There are several strings like std::string first, second, third; ... My plan was to collect their addresses into a char* array: char *addresses[] = {&first[0], &second[0], &third[0]} ... and pass the char **addresses to the OpenCL kernel. There are several problems or questions: The main issue is that I cannot pass an array of pointers. Is there any good way to use many strings from the kernel code without copying them, leaving them in shared memory? I'm using NVIDIA on Windows. So, I
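Since an array of host pointers cannot be handed to a kernel, one common workaround is to pack the strings into a single contiguous buffer and pass a second buffer of start offsets. The sketch below shows that packing step in plain C under stated assumptions: `flatten_strings` and its parameters are hypothetical names, and the approach does make one host-side copy, so it does not satisfy the "without copying" wish in the question.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Packs `count` NUL-terminated strings back to back into `flat`
   (no separators, no terminators) and records each start in `offsets`.
   `offsets` must have room for count + 1 entries; offsets[count] receives
   the total number of bytes used. Returns 1 on success, 0 if `flat` is
   too small. */
static int flatten_strings(const char *const *strings, size_t count,
                           char *flat, size_t flat_capacity, size_t *offsets)
{
    assert(strings != NULL && flat != NULL && offsets != NULL);
    size_t used = 0;
    for (size_t i = 0; i < count; ++i) {
        size_t len = strlen(strings[i]);
        if (len > flat_capacity - used)
            return 0;               /* not enough room for this string */
        offsets[i] = used;
        memcpy(flat + used, strings[i], len);
        used += len;
    }
    offsets[count] = used;
    return 1;
}
```

With this layout, string i occupies bytes offsets[i] up to (but not including) offsets[i + 1] of the flat buffer, so a kernel would need only two plain buffers (characters and offsets) rather than a pointer-to-pointer argument.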

OpenCL Kernel code compile error - Visual Studio 2019

巧了我就是萌 submitted on 2021-01-29 07:21:36
Question: I'm kind of new to OpenCL programming and am trying to run a simple vector addition code in VS 2019. However, I can't get the .cl code to compile. It's showing these 6 errors when trying to build the program: Error C2144 syntax error: 'void' should be preceded by ';' Error C4430 missing type specifier - int assumed. Note: C++ does not support default-int Error C2065 '__global': undeclared identifier Error C2146 syntax error: missing ')' before identifier 'float4' Error C2143 syntax error:

PyopenCL 3D RGBA image from numpy array

若如初见. submitted on 2021-01-29 03:39:02
Question: I want to construct an OpenCL 3D RGBA image from a numpy array, using pyopencl. I know about the cl.image_from_array() function, which basically does exactly that but gives no control over command queues or events, which cl.enqueue_copy() exposes. So I would really like to use the latter function to transfer a 3D RGBA image from host to device, but I can't seem to get the syntax of the image constructor right. So in this environment import pyopencl as cl import

OpenCL Kernel only partly writing to output buffer

泪湿孤枕 submitted on 2021-01-28 14:20:34
Question: I am reading large integer values from an array that has over a million elements. The values are obtained from a wav file using the libsndfile library. If I do not use the kernel, I can write the original array to my output file and listen to the audio with no issues. However, when I use the kernel to do the exact same thing, it writes perhaps less than a second of the song. At first, I thought this was a memory issue, so I played around with the buffer sizes and still no

What is the content of the cl_platform_id data structure?

◇◆丶佛笑我妖孽 submitted on 2021-01-28 05:48:23
Question: I understand that cl_platform_id is a data structure like: typedef struct{ foo1 bar1; foo2 bar2; ...; }cl_platform_id; But what is the content of this structure? For example, if I want to print its content to the console, what data type should I use? I tried integer but I got the error: warning: format specifies type 'int' but the argument has type 'cl_platform_id' (aka 'struct _cl_platform_id *') [-Wformat] Thanks for your help in advance. Answer 1: The cl_platform_id is an abstract (opaque)

How to reduce OpenCL enqueue time/any other ideas?

血红的双手。 submitted on 2021-01-27 20:34:40
Question: I have an algorithm and I've been trying to accelerate it using OpenCL on my NVIDIA device. It has to process a large amount of data (let's say 100k to millions of items), where for each datum: a matrix (on the device) has to be updated first (using the datum and two vectors); and only after the whole matrix has been updated, the two vectors (also on the device) are updated using the same datum. So, my host code looks something like this for (int i = 0; i < milions; i++) { clSetKernelArg(kernel

Passing a function as an argument in OpenCL

谁都会走 submitted on 2021-01-27 19:52:39
Question: Is it possible to pass a function pointer to a kernel in OpenCL 1.2? I know it can be done in C, but I don't know how to do it in OpenCL's C. Edit: I would like to do the same thing that is described in this post: How do you pass a function as a parameter in C?, but to a kernel. Previously, I have used inline functions to call them from a kernel, but I want the function to be a parameter instead of being hard-coded. Answer 1: Short: OpenCL's C != C, consider it as a syntactical help that most of it
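OpenCL C 1.2 does not allow pointers to functions, so a frequently used substitute is to pass an integer selector as a kernel argument and branch on it inside the kernel. The sketch below illustrates that pattern in plain host-side C and is not taken from the truncated answer above: `op_code`, `apply_op`, and `map_op` are hypothetical names, and the loop index `gid` stands in for `get_global_id(0)`.

```c
#include <assert.h>

/* Hypothetical operation codes the host would pass as an int kernel argument. */
enum op_code { OP_SQUARE = 0, OP_NEGATE = 1 };

static float op_square(float x) { return x * x; }
static float op_negate(float x) { return -x; }

/* Selects the operation with a switch instead of a function pointer. */
static float apply_op(int op, float x)
{
    switch (op) {
    case OP_SQUARE: return op_square(x);
    case OP_NEGATE: return op_negate(x);
    default:        return x;   /* unknown code: pass the value through */
    }
}

/* Emulates the kernel body over n elements; each iteration is one work-item. */
static void map_op(int op, const float *in, float *out, int n)
{
    assert(n >= 0);
    for (int gid = 0; gid < n; ++gid)
        out[gid] = apply_op(op, in[gid]);
}
```

Because every work-item receives the same selector, all of them take the same branch, so the switch should not by itself cause divergence within a work-group. The set of callable functions still has to be known when the kernel is compiled.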