OpenCL

How to pass vector parameter to OpenCL kernel in C?

只谈情不闲聊 submitted on 2021-02-07 08:26:03
Question: I'm having trouble passing a vector type (uint8) parameter to an OpenCL kernel function from the host code in C. In the host I've got the data in an array:

cl_uint dataArr[8] = { 1, 2, 3, 4, 5, 6, 7, 8 };

(My real data is more than just [1, 8]; this is just for ease of explanation.) I then transfer the data over to a buffer to be passed to the kernel:

cl_mem kernelInputData = clCreateBuffer(context, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR, sizeof(cl_uint)*8, dataArr, NULL);

Next, I pass this
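
A minimal host-side sketch of one common approach (not the poster's code; it assumes a hypothetical kernel declared as __kernel void foo(uint8 data, __global uint* out)): pack the eight values into a cl_uint8 and set it by value with clSetKernelArg instead of wrapping it in a cl_mem buffer.

    #include <CL/cl.h>

    // Hypothetical helper: pass eight host cl_uints as one uint8 kernel argument.
    static cl_int set_uint8_arg(cl_kernel kernel, cl_uint arg_index,
                                const cl_uint dataArr[8])
    {
        cl_uint8 vec;
        for (int i = 0; i < 8; ++i)
            vec.s[i] = dataArr[i];   // pack the host array into the OpenCL vector type
        // Vector-typed kernel parameters are set by value, not through clCreateBuffer.
        return clSetKernelArg(kernel, arg_index, sizeof(cl_uint8), &vec);
    }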

Array size and copy performance

扶醉桌前 submitted on 2021-02-07 03:18:31
Question: I'm sure this has been answered before, but I can't find a good explanation. I'm writing a graphics program where part of the pipeline is copying voxel data to OpenCL page-locked (pinned) memory. I found that this copy procedure is a bottleneck and made some measurements of the performance of a simple std::copy. The data is floats, and every chunk of data that I want to copy is around 64 MB in size. This is my original code, before any attempts at benchmarking:

std::copy(data, data
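
A minimal, hypothetical timing sketch (not from the post; dst is assumed to point at the mapped pinned region and n is the float count) that compares std::copy against memcpy as a baseline. With optimizations enabled both usually lower to the same bulk copy, so a large gap tends to point at the destination memory rather than the copy routine.

    #include <algorithm>
    #include <chrono>
    #include <cstdio>
    #include <cstring>

    // Hypothetical harness: time copying n floats from src into the pinned region dst.
    static void time_copies(const float* src, float* dst, std::size_t n)
    {
        using clock = std::chrono::steady_clock;
        auto t0 = clock::now();
        std::copy(src, src + n, dst);               // the copy used in the original code
        auto t1 = clock::now();
        std::memcpy(dst, src, n * sizeof(float));   // plain memcpy as a baseline
        auto t2 = clock::now();
        auto us = [](clock::duration d) {
            return std::chrono::duration_cast<std::chrono::microseconds>(d).count();
        };
        std::printf("std::copy: %lld us, memcpy: %lld us\n",
                    (long long)us(t1 - t0), (long long)us(t2 - t1));
    }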

OpenCL GPU Audio

天大地大妈咪最大 submitted on 2021-02-05 16:43:35
Question: There's not much on this subject, perhaps because it isn't a good idea in the first place. I want to create a realtime audio synthesis/processing engine that runs on the GPU. The reason for this is that I will also be using a physics library that runs on the GPU, and the audio output will be determined by the physics state. Is it true that the GPU can only carry audio output and can't generate it? Would this mean a large increase in latency, if I were to read the data back on the CPU and output

What is the difference between OpenCL and OpenGL's compute shader?

可紊 submitted on 2021-02-05 12:54:07
Question: I know OpenCL gives control of the GPU's memory architecture and thus allows better optimization, but, leaving this aside, can we use compute shaders for vector operations (addition, multiplication, inversion, etc.)?

Answer 1: In contrast to the other OpenGL shader types, compute shaders are not directly related to computer graphics; they provide a much more direct abstraction of the underlying hardware, similar to CUDA and OpenCL. They offer a customizable work-group size, shared memory, intra-group
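
A hypothetical example of that kind of vector operation as a compute shader (the GLSL source is embedded here as a C++ string literal; all names are illustrative, and buffer sizes are assumed to be a multiple of the work-group size). It shows the features the answer lists: an explicit local size, shared memory, and barrier() for intra-group synchronization.

    // Hypothetical compute shader: element-wise addition of two SSBO-backed arrays.
    static const char* kAddShaderSrc = R"GLSL(
    #version 430
    layout(local_size_x = 64) in;                        // customizable work-group size
    layout(std430, binding = 0) readonly  buffer A { float a[]; };
    layout(std430, binding = 1) readonly  buffer B { float b[]; };
    layout(std430, binding = 2) writeonly buffer C { float c[]; };
    shared float scratch[64];                            // per-work-group shared memory

    void main() {
        uint i = gl_GlobalInvocationID.x;
        scratch[gl_LocalInvocationID.x] = a[i] + b[i];
        barrier();                                       // intra-group synchronization
        c[i] = scratch[gl_LocalInvocationID.x];
    }
    )GLSL";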

PyOpenCL: how to create a local memory buffer?

蓝咒 submitted on 2021-02-05 07:36:59
Question: Probably an extremely simple question here, but I've been searching for hours with nothing to show for it. I have this piece of code, and I'd like to have the 256-bit (8 uint32) bitstring_gpu as a local-memory pointer in the device:

def Get_Bitstring_GPU_Buffer(ctx, bitstring):
    bitstring_gpu = cl.Buffer(ctx, mem_flags.READ_ONLY | mem_flags.COPY_HOST_PTR, hostbuf=bitstring)
    return bitstring_gpu

This is later used in a kernel call:

prg.get_active_hard_locations_64bit(queue, (HARD_LOCATIONS,), None,
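
Not PyOpenCL itself, but for reference, a minimal hypothetical sketch of the equivalent host-side call in the C API: a __local kernel parameter is given only a size and a NULL pointer (PyOpenCL wraps the same idea in a local-memory argument object such as pyopencl.LocalMemory).

    #include <CL/cl.h>

    // Hypothetical: reserve 32 bytes (8 x uint) of __local memory for a kernel
    // parameter declared as "__local uint* bitstring" in the kernel source.
    static cl_int set_local_bitstring_arg(cl_kernel kernel, cl_uint arg_index)
    {
        // For __local parameters the host passes only a size; arg_value must be NULL.
        return clSetKernelArg(kernel, arg_index, 8 * sizeof(cl_uint), NULL);
    }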

Char*** in OpenCL kernel argument?

纵然是瞬间 submitted on 2021-02-04 06:25:29
Question: I need to pass a vector<vector<string>> to an OpenCL kernel. What is the easiest way of doing it? Passing a char*** gives me an error:

__kernel void vadd( __global char*** sets, __global int* m, __global long* result) {}

ERROR: clBuildProgram(CL_BUILD_PROGRAM_FAILURE)

Answer 1: In OpenCL 1.x, this sort of thing is basically not possible. You'll need to convert your data so that it fits into a single buffer object, or at least into a fixed number of buffer objects. Pointers on the host don't make
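
A minimal sketch of the flattening the answer describes (hypothetical layout and names: all strings concatenated into one char array, with offset and length arrays alongside it, each of which can then be uploaded as its own buffer object):

    #include <cstdint>
    #include <string>
    #include <vector>

    // Hypothetical flattening of vector<vector<string>> into flat arrays that the
    // kernel can index as plain __global buffers instead of host pointers.
    struct FlatSets {
        std::vector<char>          chars;     // every string's bytes, back to back
        std::vector<std::uint32_t> offsets;   // start of each string inside 'chars'
        std::vector<std::uint32_t> lengths;   // length of each string
        std::vector<std::uint32_t> set_begin; // first string index of each set (+ end sentinel)
    };

    static FlatSets flatten(const std::vector<std::vector<std::string>>& sets)
    {
        FlatSets out;
        for (const auto& set : sets) {
            out.set_begin.push_back(static_cast<std::uint32_t>(out.offsets.size()));
            for (const auto& s : set) {
                out.offsets.push_back(static_cast<std::uint32_t>(out.chars.size()));
                out.lengths.push_back(static_cast<std::uint32_t>(s.size()));
                out.chars.insert(out.chars.end(), s.begin(), s.end());
            }
        }
        out.set_begin.push_back(static_cast<std::uint32_t>(out.offsets.size())); // end sentinel
        return out;
    }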

OpenCL CLK_LOCAL_MEM_FENCE causing abort trap 6

醉酒当歌 submitted on 2021-01-29 22:03:23
Question: I'm doing an exercise on convolution over images (info here) using OpenCL. When I use images whose size is not square (r x c), CLK_LOCAL_MEM_FENCE makes the program stop with abort trap 6. What I do is essentially fill up the local memory with the proper values, wait for this filling of local memory to finish by using barrier( CLK_LOCAL_MEM_FENCE ), and then calculate the values. It seems like when I use images like those I've described, barrier( CLK_LOCAL_MEM_FENCE
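
One common cause of that crash (hypothetical kernel fragment below, embedded as a C++ string literal with illustrative names): barrier(CLK_LOCAL_MEM_FENCE) must be reached by every work-item of a work-group, so the bounds check that a non-square image needs has to guard the memory accesses only, never the barrier itself.

    // Hypothetical kernel source: keep the barrier outside any divergent branch.
    static const char* kTileKernelSrc = R"CLC(
    __kernel void load_tile(__global const float* in, __global float* out,
                            int rows, int cols, __local float* tile)
    {
        int gx = get_global_id(0), gy = get_global_id(1);
        int lx = get_local_id(0),  ly = get_local_id(1);
        int lw = get_local_size(0);

        // Guard only the loads/stores; control flow must stay uniform for the barrier.
        if (gx < cols && gy < rows)
            tile[ly * lw + lx] = in[gy * cols + gx];
        else
            tile[ly * lw + lx] = 0.0f;

        barrier(CLK_LOCAL_MEM_FENCE);   // every work-item in the group must reach this

        if (gx < cols && gy < rows)
            out[gy * cols + gx] = tile[ly * lw + lx];   // real code would convolve here
    }
    )CLC";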

OpenCL clCreateContextFromType function results in memory leaks

限于喜欢 submitted on 2021-01-29 18:39:50
Question: I ran valgrind on one of my open-source OpenCL codes (https://github.com/fangq/mmc), and it detected a lot of memory leaks in the OpenCL host code. Most of them pointed back to the line where I create the context object using clCreateContextFromType. I double-checked all my OpenCL variables, command queues, kernels and programs, and made sure that they are all properly released, but still, when testing on sample programs, every call to the mmc_run_cl() function bumps up memory by 300 MB
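
For reference, a minimal hypothetical teardown sketch (not the fix from this thread): every clCreate*/clRetain* call needs exactly one matching clRelease*, and a missing clReleaseContext in particular keeps the driver's per-context allocations alive across repeated calls.

    #include <CL/cl.h>

    // Hypothetical cleanup: release objects in roughly the reverse order of creation.
    static void release_cl_objects(cl_mem buf, cl_kernel kernel, cl_program program,
                                   cl_command_queue queue, cl_context context)
    {
        clReleaseMemObject(buf);
        clReleaseKernel(kernel);
        clReleaseProgram(program);
        clReleaseCommandQueue(queue);
        clReleaseContext(context);   // without this, each run leaks the whole context
    }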