pyopencl

pyopenCL, openCL, Can't build program on GPU

喜你入骨 submitted on 2021-02-10 20:21:58
Question: I have a piece of kernel source which runs on the G970 on my PC but won't compile on my early 2015 MacBook Pro with Iris 6100 1536 MB graphics.

    platform = cl.get_platforms()[0]
    device = platform.get_devices()[1]  # Get the GPU ID
    ctx = cl.Context([device])  # Tell CL to use GPU
    queue = cl.CommandQueue(ctx)  # Create a command queue for the target device.
    # program = cl.Program(ctx, kernelsource).build()
    print(platform.get_devices())

This get_devices() call shows I have 'Intel(R) Core(TM) i5-5287U CPU @ 2
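A note on the excerpt above: indexing get_devices()[1] assumes the GPU is always the second device, which may not hold on every machine. Below is a minimal sketch of selecting by device type instead. find_devices is a hypothetical helper, and the stand-in objects imitate only the get_devices() method and the type attribute of pyopencl platforms and devices, so the example runs without an OpenCL driver.

```python
from types import SimpleNamespace

# Numeric values of the OpenCL device-type bits (as defined in the OpenCL headers).
CL_DEVICE_TYPE_CPU = 1 << 1
CL_DEVICE_TYPE_GPU = 1 << 2


def find_devices(platforms, type_mask):
    """Return all devices, across all platforms, whose type matches type_mask."""
    return [
        dev
        for platform in platforms
        for dev in platform.get_devices()
        if dev.type & type_mask
    ]


# Stand-ins (assumption): they imitate only what find_devices touches.
cpu = SimpleNamespace(name="stand-in CPU", type=CL_DEVICE_TYPE_CPU)
gpu = SimpleNamespace(name="stand-in GPU", type=CL_DEVICE_TYPE_GPU)
fake_platform = SimpleNamespace(get_devices=lambda: [cpu, gpu])

gpus = find_devices([fake_platform], CL_DEVICE_TYPE_GPU)
print([dev.name for dev in gpus])
```

With pyopencl installed, the same helper should accept cl.get_platforms() and cl.device_type.GPU, and the first result can then be passed to cl.Context([...]).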

Getting started with shared memory on PyCUDA

大城市里の小女人 submitted on 2021-02-08 10:35:59
Question: I'm trying to understand shared memory by playing with the following code:

    import pycuda.driver as drv
    import pycuda.tools
    import pycuda.autoinit
    import numpy
    from pycuda.compiler import SourceModule

    src = '''
    __global__ void reduce0(float *g_idata, float *g_odata) {
        extern __shared__ float sdata[];
        // each thread loads one element from global to shared mem
        unsigned int tid = threadIdx.x;
        unsigned int i = blockIdx.x*blockDim.x + threadIdx.x;
        sdata[tid] = g_idata[i];
        __syncthreads();
        // do
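Because sdata is declared extern __shared__, its size is not fixed at compile time and has to be supplied when the kernel is launched. Here is a rough sketch of that arithmetic, assuming one float per thread as in the load shown in the excerpt. launch_config is a hypothetical helper name, not part of PyCUDA.

```python
import struct

FLOAT_BYTES = struct.calcsize("=f")  # 4 bytes per single-precision float


def launch_config(n_elements, block_size):
    """Return (grid_size, shared_bytes) for a one-element-per-thread kernel."""
    if n_elements <= 0 or block_size <= 0:
        raise ValueError("n_elements and block_size must be positive")
    # Round up so that every element is covered by some thread.
    grid_size = (n_elements + block_size - 1) // block_size
    # One float of shared memory per thread in the block.
    shared_bytes = block_size * FLOAT_BYTES
    return grid_size, shared_bytes


grid_size, shared_bytes = launch_config(n_elements=1024, block_size=256)
print(grid_size, shared_bytes)
```

In PyCUDA these values would go into the kernel call as block=(256, 1, 1), grid=(grid_size, 1) and shared=shared_bytes. The kernel as excerpted does no bounds check on i, so the input length is presumably a multiple of the block size.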

PyOpenCL: how to create a local memory buffer?

蓝咒 submitted on 2021-02-05 07:36:59
Question: Probably an extremely simple question, but I've been searching for hours with nothing to show for it. I have this piece of code, and I'd like to have a 256-bit (8 uint32) bitstring_gpu as a local-memory pointer on the device:

    def Get_Bitstring_GPU_Buffer(ctx, bitstring):
        bitstring_gpu = cl.Buffer(ctx, mem_flags.READ_ONLY | mem_flags.COPY_HOST_PTR, hostbuf=bitstring)
        return bitstring_gpu

This is later used in a kernel call:

    prg.get_active_hard_locations_64bit(queue, (HARD_LOCATIONS,), None,

PyopenCL 3D RGBA image from numpy array

若如初见. submitted on 2021-01-29 03:39:02
Question: I want to construct an OpenCL 3D RGBA image from a numpy array using pyopencl. I know about the cl.image_from_array() function, which does basically that but gives no control over command queues or events, unlike cl.enqueue_copy(). So I would really like to use the latter function to transfer a 3D RGBA image from host to device, but I can't seem to get the syntax of the image constructor right. So in this environment

    import pyopencl as cl
    import
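One detail that often trips up the manual route is axis order: OpenCL image extents are given as (width, height, depth), while a C-contiguous numpy array with the channel last is indexed as (depth, height, width, 4). Below is a minimal sketch of that mapping, assuming this array layout. image_copy_params is a hypothetical helper and does not call pyopencl itself.

```python
def image_copy_params(array_shape):
    """Map a (depth, height, width, 4) array shape to image shape, origin and region."""
    if len(array_shape) != 4 or array_shape[-1] != 4:
        raise ValueError("expected a (depth, height, width, 4) RGBA array shape")
    depth, height, width, _channels = array_shape
    # Image extents run fastest-varying axis first, the reverse of the array axes.
    shape = (width, height, depth)
    return {"shape": shape, "origin": (0, 0, 0), "region": shape}


params = image_copy_params((2, 3, 5, 4))
print(params)
```

Under these assumptions, shape would be passed to the cl.Image constructor together with an RGBA cl.ImageFormat, and origin and region to cl.enqueue_copy(queue, image, array, origin=..., region=...), which returns an event that can be waited on.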

Installing PyOpenCL on Windows using Intel's SDK and pip

允我心安 submitted on 2020-06-17 09:07:09
Question: Following these instructions, I have downloaded and installed Intel's OpenCL™ SDK (Intel® System Studio) from here. The cl.h file is in the folder C:\Program Files (x86)\IntelSWTools\system_studio_2020\OpenCL\sdk\include\CL; however, when running pip install pyopencl I get this long error message:

    Building wheel for pyopencl (PEP 517) ... error
    ERROR: Command errored out with exit status 1:
    command: 'c:\python38\python.exe' 'c:\python38\lib\site-packages\pip\_vendor\pep517\_in_process.py'

fatal error C1083: Cannot open include file: 'CL/cl.h'

↘锁芯ラ submitted on 2020-06-13 00:11:06
Question: I read all the solutions provided on this website in order to solve this problem, but it still exists. When I run the command C:\pyopencl-2016.2.1>setup.py install in cmd on Windows 10, this error is shown:

    c:\pyopencl-2016.2.1\src\c_wrapper\clinfo_ext.h(10) : fatal error C1083: Cannot open include file: 'CL/cl.h': No such file or directory
    error: command 'C:\\Users\\Neda\\AppData\\Local\\Programs\\Common\\Microsoft\\Visual C++ for Python\\9.0\\VC\\Bin\\amd64\\cl.exe' failed with exit

Anyway to work with Keras in Mac with AMD GPU?

早过忘川 submitted on 2020-04-01 16:58:11
Question: I have a MacBook Pro with an AMD processor and I want to run Keras (TensorFlow backend) on this GPU. I came to know that Keras only works with NVIDIA GPUs. What is the workaround (if possible)?

Answer 1: You can use the OpenCL library to overcome this. I have tested it and it works fine for me. Note: I have Python version 3.7 and I will be using pip3 for package installation. Steps:

Install the OpenCL package with the following command:
    pip3 install pyopencl
Install the PlaidML library using the following command:
    pip3

Time measuring in PyOpenCL

ぐ巨炮叔叔 submitted on 2020-01-23 16:46:26
Question: I am running a kernel using PyOpenCL on an FPGA and on a GPU. To measure the time it takes to execute, I use:

    t1 = time()
    event = mykernel(queue, (c_width, c_height), (block_size, block_size), d_c_buf, d_a_buf, d_b_buf, a_width, b_width)
    event.wait()
    t2 = time()
    compute_time = t2-t1
    compute_time_e = (event.profile.end-event.profile.start)*1e-9

This gives me the execution time from the point of view of the host (compute_time) and of the device (compute_time_e). The problem is that
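The two numbers in the excerpt measure different things: the host interval also includes queueing and submission, while end minus start covers only execution on the device. Here is a minimal sketch that separates these stages using the queued, submit, start and end counters of event.profile, which pyopencl reports in nanoseconds and which are only available when the queue was created with profiling enabled. profile_breakdown and time_launch are hypothetical helpers, and the stand-in event carries made-up counter values so the example runs without a device.

```python
import time
from types import SimpleNamespace

NS_TO_S = 1e-9  # OpenCL profiling counters are in nanoseconds


def profile_breakdown(event):
    """Split an event's profiling counters into per-stage durations in seconds."""
    p = event.profile
    return {
        "queued_to_submit": (p.submit - p.queued) * NS_TO_S,
        "submit_to_start": (p.start - p.submit) * NS_TO_S,
        "start_to_end": (p.end - p.start) * NS_TO_S,
    }


def time_launch(launch):
    """Call launch(), wait for its event, and return (host_seconds, breakdown)."""
    t1 = time.perf_counter()
    event = launch()
    event.wait()
    t2 = time.perf_counter()
    return t2 - t1, profile_breakdown(event)


# Stand-in event (assumption) with made-up counter values, in nanoseconds.
fake_event = SimpleNamespace(
    profile=SimpleNamespace(queued=0, submit=200_000, start=500_000, end=3_000_000),
    wait=lambda: None,
)

host_s, stages = time_launch(lambda: fake_event)
print(host_s, stages)
```

With a real queue, launch would be a small function that enqueues mykernel and returns its event. Comparing host_s against the sum of the stages gives a rough idea of how much time is spent outside the device.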

Struct Alignment with PyOpenCL

狂风中的少年 submitted on 2020-01-06 08:13:24
Question: Update: the int4 in my kernel was wrong. I am using pyopencl but am unable to get struct alignment to work correctly. In the code below, which calls the kernel twice, the b value is returned correctly (as 1), but the c value has some "random" value. In other words: I am trying to read two members of a struct. I can read the first but not the second. Why? The same issue occurs whether I use numpy structured arrays or pack with struct. And the __attribute__ settings in the comments don't help
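A first member that reads correctly followed by a garbage second member is the typical signature of a host/device layout mismatch. Below is a minimal sketch of the effect using Python's struct module. It assumes a hypothetical device struct { int b; int4 c; }, which is not taken from the asker's truncated code. OpenCL aligns int4 to 16 bytes, so the device would expect c at offset 16 rather than 4.

```python
import struct

# Hypothetical device-side struct (assumption, not the asker's code):
#   typedef struct { int b; int4 c; } item_t;
PACKED = struct.Struct("=i4i")     # no padding: b at offset 0, c at offset 4
PADDED = struct.Struct("=i12x4i")  # 12 pad bytes: b at offset 0, c at offset 16

packed = PACKED.pack(1, 2, 3, 4, 5)
padded = PADDED.pack(1, 2, 3, 4, 5)

C_OFFSET = 16  # where the device would look for c
C_BYTES = 16   # four 32-bit ints


def read_b(buf):
    """Read b, which sits at offset 0 in both layouts."""
    return struct.unpack_from("=i", buf, 0)[0]


def read_c(buf):
    """Read c at the device's offset; None if the buffer ends before c does."""
    if len(buf) < C_OFFSET + C_BYTES:
        return None
    return struct.unpack_from("=4i", buf, C_OFFSET)


print(len(packed), read_b(packed), read_c(packed))
print(len(padded), read_b(padded), read_c(padded))
```

In both layouts b comes back as 1, but only the padded buffer has c where the device expects it. For numpy structured arrays, np.dtype(..., align=True) and pyopencl.tools.match_dtype_to_c_struct are the usual starting points for getting a matching layout.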