opencl

Intel CPU OpenCL in Mono killed by SIGXCPU (Ubuntu)

白昼怎懂夜的黑 posted on 2020-01-24 19:54:27
Question: Some time ago I wrote a simple boids simulation using OpenCL (it was a school assignment), in C#, with Cloo for OpenCL and OpenTK for OpenGL output. I tested it on Windows 7 with the AMD CPU implementation of OpenCL and on a friend's NVIDIA GPU. Now I have tried it on Linux (Ubuntu 12.04). I installed the AMD APP SDK and the Intel SDK. It compiled fine, and the reference CPU implementation works fine with graphical output. But when I try to run the OpenCL version, it runs for about 1 second (showing what seems like valid output in

OpenCL : Querying max clock frequency of a mobile GPU always returns a lesser value

牧云@^-^@ posted on 2020-01-24 15:48:08
Question: To find the max clock frequency of a Mali T760 GPU, I used the code snippet below:

    // Get device max clock frequency
    cl_uint max_clock_freq;
    err_num = clGetDeviceInfo(cl_devices[device_idx], CL_DEVICE_MAX_CLOCK_FREQUENCY, sizeof(max_clock_freq), &max_clock_freq, NULL);
    check_cl_error(err_num, "clGetDeviceInfo: Getting device max clock frequency");
    // cl_uint is unsigned, so print it with %u
    printf("CL_DEVICE_MAX_CLOCK_FREQUENCY: %u MHz\n", max_clock_freq);

Full source code available here: https://github.com/sivagnanamn/opencl

Time measuring in PyOpenCL

ぐ巨炮叔叔 posted on 2020-01-23 16:46:26
Question: I am running a kernel using PyOpenCL on an FPGA and on a GPU. To measure the time it takes to execute, I use:

    from time import time

    t1 = time()
    event = mykernel(queue, (c_width, c_height), (block_size, block_size), d_c_buf, d_a_buf, d_b_buf, a_width, b_width)
    event.wait()
    t2 = time()
    compute_time = t2-t1
    compute_time_e = (event.profile.end-event.profile.start)*1e-9

This gives me the execution time from the point of view of the host (compute_time) and of the device (compute_time_e). The problem is that

Is the access performance of __constant memory the same as __global memory in OpenCL

大城市里の小女人 posted on 2020-01-21 12:53:13
Question: As far as I know, constant memory in CUDA is a dedicated memory, and it is faster than global memory. But in the OpenCL spec I found the following: "The __constant or constant address space name is used to describe variables allocated in global memory and which are accessed inside a kernel(s) as read-only variables". So __constant memory is allocated from __global memory. Does that mean it has the same access performance as __global memory? Answer 1: It depends on the hardware and software

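Introduction to OpenCL

你。 posted on 2020-01-19 13:59:45
OpenCL (Open Computing Language) is the first open, royalty-free standard for general-purpose parallel programming of heterogeneous systems. It is also a unified programming environment that lets software developers write efficient, portable code for high-performance computing servers, desktop systems, and handheld devices. It applies broadly to multi-core processors (CPUs), graphics processors (GPUs), Cell-type architectures, and other parallel processors such as digital signal processors (DSPs), and it has broad prospects in games, entertainment, scientific research, medicine, and other fields.
Basic information: OpenCL is a framework for writing programs for heterogeneous platforms, which may consist of CPUs, GPUs, or other types of processors. OpenCL consists of a language (based on C99) for writing kernels (functions that run on OpenCL devices) and a set of APIs for defining and controlling the platform. OpenCL provides parallel computing mechanisms based on task partitioning and data partitioning. OpenCL is similar to two other open industry standards, OpenGL and OpenAL, which are used for 3D graphics and computer audio respectively. OpenCL extends the capabilities of the GPU beyond graphics generation. OpenCL is managed by the non-profit technology consortium Khronos Group.
History: OpenCL was originally developed by Apple, which holds the trademark rights, and was refined in collaboration with technical teams at AMD, IBM, Intel, and NVIDIA. Apple then submitted this draft to the Khronos Group. At WWDC in June 2008, Apple proposed the OpenCL specification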

RK3288 OpenCL: printing platform and device information

ε祈祈猫儿з posted on 2020-01-18 04:44:12
Preparation: 1) Download the header files: https://github.com/KhronosGroup/OpenCL-Headers/tree/master/CL 2) Export the library file libGLES_mali.so from the RK3288 Android system path /system/vendor/lib/egl
1. Print the platform information; 2. Print the device information; 3. Print the total work-items;

    void print_openCL_platform_device()
    {
        int i, j;
        char info[1024];
        cl_int err;
        cl_uint nPlatform;
        cl_platform_id *listPlatform;
        cl_uint nDevice;
        cl_device_id *listDevice;
        cl_uint nMaxComputeUnits = 0;
        cl_uint nMaxWorkItemDims = 0;
        size_t *nMaxWorkItemSizes = NULL;
        size_t nMaxGlobalWorkSize = 1;
        size_t nMaxWorkGroupSize = 0;

        err = clGetPlatformIDs(0, NULL, &nPlatform);
        if(err < 0) {
            perror("Couldn't find any

getting cl_build_program_failure error

北慕城南 posted on 2020-01-17 12:25:26
Question: I am currently working on an OpenCL project and ran into trouble when trying to build the program. I have the following code:

    //Read source file
    std::ifstream sourceFile("calculation_kernel.cl");
    std::string sourceCode(std::istreambuf_iterator<char>(sourceFile), (std::istreambuf_iterator<char>()));
    cl::Program::Sources source(1, std::make_pair(sourceCode.c_str(), sourceCode.length()+1));
    if (sourceFile.is_open()){
        printf("the file is open\n");
    }else{
        printf("error opening
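The excerpt above builds sourceCode from the stream before it checks sourceFile.is_open(), so a missing file would silently yield an empty source string. The following is a minimal sketch in plain C of doing the check first; it is not the asker's code, and read_kernel_source is a hypothetical helper name introduced only for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not part of any OpenCL API): returns a heap-allocated,
 * NUL-terminated copy of the file at `path` and stores its length (without
 * the terminator) in *out_len. Returns NULL if the file cannot be opened or
 * fully read, so the caller can stop before handing an empty string to the
 * OpenCL compiler. The caller owns the buffer and must free() it. */
char *read_kernel_source(const char *path, size_t *out_len)
{
    assert(path != NULL && out_len != NULL);

    /* Check that the file opened BEFORE trying to read from it. */
    FILE *fp = fopen(path, "rb");
    if (fp == NULL) {
        perror(path);
        return NULL;
    }

    /* Determine the file size by seeking to the end. */
    if (fseek(fp, 0, SEEK_END) != 0) {
        fclose(fp);
        return NULL;
    }
    long size = ftell(fp);
    if (size < 0) {
        fclose(fp);
        return NULL;
    }
    rewind(fp);

    char *buf = malloc((size_t)size + 1);
    if (buf == NULL) {
        fclose(fp);
        return NULL;
    }

    size_t got = fread(buf, 1, (size_t)size, fp);
    fclose(fp);
    if (got != (size_t)size) {
        free(buf);
        return NULL;
    }

    buf[got] = '\0';
    *out_len = got;
    return buf;
}
```

If the source does load correctly, the build log returned by clGetProgramBuildInfo with CL_PROGRAM_BUILD_LOG is usually the next place to look for the actual compile error.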

Access/synchronization to local memory

☆樱花仙子☆ posted on 2020-01-17 06:22:29
Question: I'm pretty new to GPGPU programming. I'm trying to implement an algorithm that needs a lot of synchronization, so it uses only one work-group (global and local size have the same value). I have the following problem: my program works correctly until the size of the problem exceeds 32.

    __kernel void assort(
        __global float *array,
        __local float *currentOutput,
        __local float *stimulations,
        __local int *noOfValuesAdded,
        __local float *addedValue,
        __local float *positionToInsert,
        __local int *activatedIdx,
        _

Access/synchronization to local memory

狂风中的少年 posted on 2020-01-17 06:22:00
Question: I'm pretty new to GPGPU programming. I'm trying to implement an algorithm that needs a lot of synchronization, so it uses only one work-group (global and local size have the same value). I have the following problem: my program works correctly until the size of the problem exceeds 32.

    __kernel void assort(
        __global float *array,
        __local float *currentOutput,
        __local float *stimulations,
        __local int *noOfValuesAdded,
        __local float *addedValue,
        __local float *positionToInsert,
        __local int *activatedIdx,
        _

running openCL in android

筅森魡賤 posted on 2020-01-16 09:10:15
Question: There are many tutorials and posts about this, but I don't understand exactly how to deal with the libOpenCL.so file. Many vendors do not include it on their phones, but my app needs to support as many current phones as possible, so do I need to get a compatible libOpenCL.so file for each of them? Answer 1: OpenCL is not officially supported by the Android Open Source Project. See: Why did Google choose RenderScript instead of OpenCL. However, it appears that device manufacturers are including support by adding
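
One way to cope with phones that may or may not ship the library is to probe for it at run time instead of linking against it. The sketch below is only an illustration under stated assumptions: it assumes POSIX dlopen is available, the candidate paths are guesses (the Mali path is the one mentioned in the RK3288 entry above), and open_first_library and opencl_available are hypothetical helper names. Real devices may keep the library elsewhere.

```c
#include <dlfcn.h>
#include <stddef.h>

/* ASSUMPTION: illustrative candidate locations only; vendors may differ. */
static const char *const kOpenClCandidates[] = {
    "libOpenCL.so",
    "/system/vendor/lib/libOpenCL.so",
    "/system/lib/libOpenCL.so",
    "/system/vendor/lib/egl/libGLES_mali.so",
};

/* Hypothetical helper: tries each path in order and returns the first handle
 * that dlopen() can load, or NULL if none loads. If `found` is non-NULL it
 * receives the path that worked (or NULL on failure). */
void *open_first_library(const char *const *paths, size_t count,
                         const char **found)
{
    for (size_t i = 0; i < count; ++i) {
        void *handle = dlopen(paths[i], RTLD_NOW | RTLD_LOCAL);
        if (handle != NULL) {
            if (found != NULL) {
                *found = paths[i];
            }
            return handle;
        }
    }
    if (found != NULL) {
        *found = NULL;
    }
    return NULL;
}

/* Hypothetical helper: returns 1 if a candidate library loads and exports
 * clGetPlatformIDs, 0 otherwise, so the app can fall back to a CPU path. */
int opencl_available(void)
{
    size_t count = sizeof(kOpenClCandidates) / sizeof(kOpenClCandidates[0]);
    void *handle = open_first_library(kOpenClCandidates, count, NULL);
    if (handle == NULL) {
        return 0;
    }
    int ok = dlsym(handle, "clGetPlatformIDs") != NULL;
    dlclose(handle);
    return ok;
}
```

With this approach the app does not bundle a per-device libOpenCL.so; it uses whatever the vendor provides and disables the OpenCL path when opencl_available() returns 0. On some toolchains the program must be linked with -ldl.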