opencl | 易学教程

How to get compute unit ID at runtime in OpenCL?

Question: Is there a way to get the ID of the compute unit that a work-group is running on at runtime? I know that CUDA has some assembly code to do this.

Answer 1: No, there isn't a way to get the compute unit's ID. Your code should use the work-group ID instead. What are you trying to achieve? I am a little surprised that CUDA supports this; please tell me which assembly instruction does this.

Source: https://stackoverflow.com/questions/19547197/how-to-get-compute-unit-id-at-runtime-in-opencl

printf makes error and don't show the result

Question: I have a problem with printf in OpenCL. This is the relevant part of my code:

    clGetEventProfilingInfo(timing_event, CL_PROFILING_COMMAND_START, sizeof(time_start), &time_start, NULL);
    clGetEventProfilingInfo(timing_event, CL_PROFILING_COMMAND_END, sizeof(time_end), &time_end, NULL);
    total_time = time_end - time_start;
    printf("\nAverage Time In Nanoseconds = %lu\n", total_time);

I have declared the variables like this:

    cl_event timing_event;
    cl_ulong time_start, time_end;
    cl_ulong total_time;

but

opencl mapped memory doesn't work

Question: I am trying to implement memory-mapping techniques in my OpenCL program, but it doesn't work! Here is my kernel code:

    __kernel void update(__global char *in, __global char *out)
    {
        size_t i;
        for (i = 0; i < 10; i++);
        out[i] += 'A' - 'a';
    }

Here is the host code:

    cl_platform_id platformId = NULL;
    cl_device_id deviceId = NULL;
    cl_context context = NULL;
    cl_command_queue commandQueue = NULL;
    cl_mem cmPinnedBufIn = NULL;
    cl_mem cmPinnedBufOut = NULL;
    cl_mem cmDevBufIn = NULL;
    cl_mem cmDevBufOut = NULL;

How can I make IDCT run faster on my GPU?

Question: I am trying to optimize the IDCT from this code for the GPU. The GPU in my system is an NVIDIA Tesla K20c. The IDCT function as written in the original code looks like this:

    void IDCT(int32_t *input, uint8_t *output)
    {
        int32_t Y[64];
        int32_t k, l;
        for (k = 0; k < 8; k++) {
            for (l = 0; l < 8; l++)
                Y(k, l) = SCALE(input[(k << 3) + l], S_BITS);
            idct_1d(&Y(k, 0));
        }
        for (l = 0; l < 8; l++) {
            int32_t Yc[8];
            for (k = 0; k < 8; k++)
                Yc[k] = Y(k, l);
            idct_1d(Yc);
            for (k = 0; k < 8; k++) {
                int32_t r

clGetDeviceInfo and clGetPlatformInfo fails in OpenCL with error code -30 (CL_INVALID_VALUE)

Question: I am starting to write a small "engine" for using OpenCL, and I have run into a rather strange problem. When I call clGetDeviceInfo() to query information about a specific device, some of the values for the param_name parameter return the error code -30 (= CL_INVALID_VALUE). A well-known example is CL_DEVICE_EXTENSIONS, which should return a string of extensions no matter which SDK or platform I am using. I have checked every edge case and double-checked the parameters.

OpenCL Client Side Requirements

Question: I have implemented a project on my computer using AMD SDK v2.5 and ATI Catalyst drivers, as I have an ATI HD5570 graphics card. I would like my executable to run on a different platform, and I would like to be able to check whether an OpenCL platform is available on the machine where my executable runs. Of course, that machine may have an Nvidia graphics card. I have searched the internet but couldn't find a definitive answer to my question. I am totally lost in my search. Is

ocl-facedetect sample of opencv 2.4.6.1

Question: On Ubuntu 12.04 LTS with an NVIDIA GeForce 8 series GPU, I am trying to run the ocl-facedetect sample of OpenCV 2.4.6.1 and am seeing the following error:

    $./ocl-example-facedetect -t haarcascade_frontalface_alt.xml -i friends.jpg
    In image read loop0
    ~~~~ Loading convertC3C4
    Building source:./convertC3C4_GeForce 8600 GT -D GENTYPE4=uchar4.clb
    ~~~~ Loading RGB2Gray
    Building source:./RGB2Gray_GeForce 8600 GT -D DEPTH_0.clb
    ~~~~ Loading resizeLN_C1_D0
    Building source

Code terminates after saying COULD NOT CREATE KERNEL on Eclipse

Question: I am trying to translate sequential C code for an MJPEG decoder into OpenCL. I got the C code from this github project. I am now trying to convert the original C code for the IDCT into OpenCL. I copied the IDCT code from the .c file and pasted it into my .cl file, which I named invCosine.cl.

invCosine.cl:

    #define IDCT_INT_MIN (- IDCT_INT_MAX - 1)
    #define IDCT_INT_MAX 2147483647

    /*
     * Useful constants:
     */

    /*
     * ck = cos(k*pi/16) = s8-k = sin((8-k)*pi/16) times 1 << C_BITS and
     * rounded

declaring and defining pointer vetors of vectors in OpenCL Kernel

Question: I have a variable that is a vector of vectors. In C++ I can easily declare and define it, but in an OpenCL kernel I am running into issues. Here is an example of what I am trying to do:

    std::vector<vector <double>> filter;
    for (int m= 0;m<3;m++)
    {
        const auto& w = filters[m];
        -------sum operation using w
    }

Here I can easily reference the values of filters[m] through w, but I am not able to do this in the OpenCL kernel file. Here is what I have tried, but it gives me the wrong output. In host

Why AMD GCN uses non-zero NULL?

Question: This commit says:

    In amdgcn target, null pointers in global, constant, and generic address space take value 0 but null pointers in private and local address space take value -1.

How do they use those two different values of NULL?

Answer 1: As to why: I don't know this for a fact, but local/private address space pointers are almost certainly just implemented as offsets/indices in a flat physical register file/memory area. There's no virtual memory-like address remapping, just a big array. You