opencl

OpenCL, C++: Unexpected Results of simple sum float vector program

徘徊边缘, posted on 2019-12-25 10:18:07
Question: This is a simple program that reads two float4 vectors from files and then calculates the sum of opposite numbers. The results were not what I expected! The main file: #include <limits.h> #include <stdio.h> #include <stdlib.h> #include <iostream> #include <iomanip> #include <array> #include <fstream> #include <sstream> #include <string> #include <algorithm> #include <iterator> #ifdef __APPLE__ #include <OpenCL/opencl.h> #else #include <CL/cl.h> #include <time.h> #endif const int number_of_points = 16; //

EXCEPTION_ACCESS_VIOLATION (0xc0000005) when trying to free memory

家住魔仙堡, posted on 2019-12-25 07:35:20
Question: When using OpenCL via LWJGL, I am getting the following error message: # # A fatal error has been detected by the Java Runtime Environment: # # EXCEPTION_ACCESS_VIOLATION (0xc0000005) at pc=0x0000000002201971, pid=8476, tid=8920 # # JRE version: 7.0_03-b05 # Java VM: Java HotSpot(TM) 64-Bit Server VM (22.1-b02 mixed mode windows-amd64 compressed oops) # Problematic frame: # C [OpenCL.dll+0x1971] # # Failed to write core dump. Minidumps are not enabled by default on client versions of Windows

Can I define a type alias depending on runtime conditions in C++?

大城市里の小女人, posted on 2019-12-25 06:34:54
Question: I am working on an OpenCL program which uses the "cl_khr_byte_addressable_store" extension to allow byte-level writes in kernel code. For my program to also work on GPUs which do not support the extension, I need to check whether the extension is available and use an int type instead of char on platforms where it is not. In the kernel code, I can simply add the following snippet and use writeable_type throughout my kernel code; the preprocessor definition cl_khr_byte
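A C++ type alias is fixed at compile time, so it cannot depend on a runtime value directly. One common workaround, sketched here under stated assumptions, is to template the host code on the element type and branch once at runtime between the two instantiations. The helper names `has_extension`, `output_buffer_bytes`, and `output_buffer_bytes_for` are hypothetical and are not taken from the question. The extension list is assumed to be the space-separated string that a device reports for `CL_DEVICE_EXTENSIONS`.

```cpp
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>

// Returns true if `name` appears as a whole token in a space-separated
// extension list (assumed format of the CL_DEVICE_EXTENSIONS string).
bool has_extension(const std::string& extension_list, const std::string& name) {
    std::istringstream tokens(extension_list);
    std::string token;
    while (tokens >> token) {
        if (token == name) return true;
    }
    return false;
}

// Hypothetical host-side routine, templated on the type the kernel writes.
template <typename Writeable>
std::size_t output_buffer_bytes(std::size_t element_count) {
    return element_count * sizeof(Writeable);
}

// Single runtime branch that selects one of the two compile-time instantiations.
std::size_t output_buffer_bytes_for(const std::string& extension_list,
                                    std::size_t element_count) {
    if (has_extension(extension_list, "cl_khr_byte_addressable_store")) {
        return output_buffer_bytes<std::int8_t>(element_count);   // char-sized elements
    }
    return output_buffer_bytes<std::int32_t>(element_count);      // int-sized fallback
}
```

Calling `output_buffer_bytes_for(extensions, n)` keeps the runtime check in one place; everything below it is ordinary templated code that sees a fixed type.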

Shortest paths by BFS, porting a code from CUDA to openCL

折月煮酒, posted on 2019-12-25 06:27:45
Question: I am currently porting CUDA code that finds the shortest paths from each node to the other nodes in an (undirected) graph. Basically, the CUDA code constructs a graph read from a text file. Then it builds the adjacency arrays h_v and h_e. For example, A B A C B C gives h_v[0] = 0, h_e[0]=1 h_v[1] = 0, h_e[1]=2 h_v[2] = 1, h_e[2]=2 Then it calls the kernel to compute the shortest paths from each node using BFS. The CUDA host code is as follows: int cc_bfs(int n_count, int e_count, int *h_v, int
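The h_v / h_e example above can be reproduced with a short host-side sketch. It assumes that nodes are numbered in order of first appearance (A = 0, B = 1, C = 2), which matches the example but is not stated in the question. `EdgeArrays` and `build_edge_arrays` are hypothetical names.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

struct EdgeArrays {
    std::vector<int> h_v;  // source node index of each edge
    std::vector<int> h_e;  // destination node index of each edge
};

// Builds the two parallel edge arrays from a list of (source, destination) names.
EdgeArrays build_edge_arrays(
    const std::vector<std::pair<std::string, std::string>>& edges) {
    std::map<std::string, int> index_of;

    // Assumption: a node gets the next free index the first time it is seen.
    auto id = [&index_of](const std::string& name) {
        auto it = index_of.find(name);
        if (it != index_of.end()) return it->second;
        int next = static_cast<int>(index_of.size());
        index_of.emplace(name, next);
        return next;
    };

    EdgeArrays out;
    for (const auto& edge : edges) {
        out.h_v.push_back(id(edge.first));
        out.h_e.push_back(id(edge.second));
    }
    return out;
}
```

For the edge list A B, A C, B C this yields h_v = {0, 0, 1} and h_e = {1, 2, 2}, the values quoted in the question.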

OpenCL device memory read/write issue

萝らか妹, posted on 2019-12-25 02:55:59
Question: I am using TI's Keystone II, which has an ARM host and 8 accelerator DSP cores. These DSP cores don't talk to each other, as they do not have any shared memory between them. I have a strange issue: I am unable to write again to the 'cum' array, in which I am computing the cumulative frequency. I am only able to read whatever I wrote to it the first time. The writes after that are not registered. Any solutions to this issue? The device has a Unified Memory architecture. Also 'cum' and

Display kernel error

随声附和, posted on 2019-12-25 01:49:57
Question: I'm using GCC and the NVIDIA implementation of OpenCL, with online compilation instead of offline compilation. I use this list to check which error I have. Nevertheless, if I have an error inside my kernel, the only information I get is the error value -48. My question is: is there a way to display the exact kernel compilation error? If a semicolon is missing, or I have a wild pointer, I would like to be told so, instead of just getting a -48 error. Otherwise the development time is getting too

Parallel execution is not performed OpenCL MQL5

偶尔善良, posted on 2019-12-25 01:06:28
Question: I have created an OpenCL kernel in MQL5. Here is the code: const string cl_src = //" int weightsum; \r\n" " #pragma OPENCL EXTENSION cl_khr_fp64 : enable \r\n" "__kernel void CalculateSimpleMA( \r\n" "int rates_total, \r\n" "int prev_calculated, \r\n" "int begin, \r\n" "int InpMAPeriod, \r\n" "__global double *price, \r\n" "__global double *ExtLineBuffer \r\n" ") \r\n" "{ \r\n" "int i,limit; \r\n" "int len_price = get_global_id(4); \r\n" //"int len_Ext = get_global_id(5); \r\n" " if

Most efficient way of converting byte array to vector

社会主义新天地, posted on 2019-12-24 23:11:25
Question: What is the most efficient way of converting an array of 16 bytes into a uint4 vector? Currently, I manually OR the bytes into uints, then set the vector's components with the completed uints. Is there OpenCL support for performing this task? This is for OpenCL 1.2. Edit: here is my code: local uchar buffer[16]; uint v[4]; for (int i = 0; i < 4; ++i) { v[i]=0; for (int j = 0; j < 4; ++j) { v[i] |= (buffer[(i<<2)+j]) << (j<<3); } } uint4 result = (uint4)(v[0],v[1],v[2],v[3]); Edit 2: buffer is
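As a reference for checking kernel output, the packing loop in the question can be mirrored on the host in plain C++. This is only a sketch of the same arithmetic (byte j of each group lands in bits 8j to 8j+7, so the lowest-indexed byte becomes the least significant); it is not an OpenCL built-in, and `pack_bytes` is a hypothetical name.

```cpp
#include <array>
#include <cstdint>

// Packs 16 bytes into four 32-bit words, lowest-indexed byte in the low bits,
// following the same index and shift arithmetic as the loop in the question.
std::array<std::uint32_t, 4> pack_bytes(const std::array<std::uint8_t, 16>& buffer) {
    std::array<std::uint32_t, 4> v{};  // value-initialized to zero
    for (int i = 0; i < 4; ++i) {
        for (int j = 0; j < 4; ++j) {
            // Widen before shifting so the upper bytes are not lost.
            v[i] |= static_cast<std::uint32_t>(buffer[i * 4 + j]) << (j * 8);
        }
    }
    return v;
}
```

Feeding the bytes 0 through 15 gives 0x03020100 as the first word, which is a convenient pattern for comparing against whatever the kernel produces.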

OpenCL Kernel wait/delay

萝らか妹, posted on 2019-12-24 22:26:36
Question: I'm new to OpenCL. How can I make a delay in an OpenCL kernel without using loops? I have code that in some circumstances needs to wait for some time and then resume execution, like so: __kernel void test(uint4 value,uint4 delay) { uint id = get_global_id(0); //some code for(uint i=0;i<delay;i++) { //... do nothing like this? } } But I suppose that the loop will keep the GPU busy as hell. Is there something like sleep that I can use in the kernel? I looked up in the sdk

How to check if the system has AMD or NVIDIA in C#?

霸气de小男生, posted on 2019-12-24 20:29:17
Question: I'm trying to make an Ethereum mining client using C#, and I need to check whether the system has AMD or NVIDIA. This is because the program needs to know whether it should mine Ethereum via CUDA or OpenCL. Answer 1: You need to use the System.Management namespace (you can find it under References/Assemblies). After adding the namespace, you need to go through all properties of ManagementObject and all of its PropertyData entries until you find the description in the name property. I wrote this solution for