gpu

How can I use TensorFlow on Windows with AMD GPU?

Submitted by 穿精又带淫゛_ on 2020-01-01 07:02:53
Question: I want to use TensorFlow on Windows (Win 10) with an AMD GPU. If I google, there are a lot of discussions and sources, but I just couldn't figure out the best way to do this at the moment. Could someone write short installation instructions for what they consider the best and most up-to-date way of doing so? Answer 1: TensorFlow officially supports only CUDA, which is a proprietary NVIDIA technology. There is one unofficial implementation using OpenCL here which could work, or you could try using …
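The answer is cut off above; one commonly suggested route it may be pointing to is PlaidML, an OpenCL-based Keras backend that runs on AMD GPUs. A minimal sketch, assuming PlaidML is installed and configured first (`pip install plaidml-keras`, then run `plaidml-setup` and select the AMD device); the tiny model is only a placeholder to confirm the backend works:

    # Route Keras through the PlaidML (OpenCL) backend instead of TensorFlow.
    # Assumes: pip install plaidml-keras && plaidml-setup  (select the AMD GPU)
    import plaidml.keras
    plaidml.keras.install_backend()  # must run before any keras import

    from keras.models import Sequential
    from keras.layers import Dense

    # Placeholder model just to verify the OpenCL device is picked up
    model = Sequential([Dense(10, activation="relu", input_shape=(4,)),
                        Dense(1)])
    model.compile(optimizer="adam", loss="mse")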

GPU utilization mostly 0% during training

Submitted by Deadly on 2020-01-01 05:48:14
Question: (GTX 1080, TensorFlow 1.0.0) During training, the nvidia-smi output (below) suggests that GPU utilization is 0% most of the time, even though the GPU is being used. Judging by how long training is already taking, that seems to be the case. Once in a while it peaks to 100% or so, but only for a second.

    +-----------------------------------------------------------------------------+
    | NVIDIA-SMI 375.26                 Driver Version: 375.26                    |
    |-------------------------------+----------------------+----------------------+
    …
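The excerpt ends inside the nvidia-smi table, so the answer is not shown. A utilization pattern that sits at 0% and spikes briefly usually points to the input pipeline starving the GPU. A hedged sketch of the standard fix in current TensorFlow using the tf.data API; the file pattern and record schema below are assumptions, since the question's feeding code is not shown:

    import tensorflow as tf

    def parse_fn(record):
        # Hypothetical TFRecord schema; adjust to the real layout.
        feats = tf.io.parse_single_example(
            record, {"x": tf.io.FixedLenFeature([784], tf.float32),
                     "y": tf.io.FixedLenFeature([], tf.int64)})
        return feats["x"], feats["y"]

    # Hypothetical file pattern standing in for the question's data.
    files = tf.data.Dataset.list_files("train-*.tfrecord")

    dataset = (tf.data.TFRecordDataset(files)
               .map(parse_fn, num_parallel_calls=tf.data.AUTOTUNE)
               .shuffle(10_000)
               .batch(128)
               .prefetch(tf.data.AUTOTUNE))  # overlap CPU-side input work with GPU compute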

Low GPU usage by Keras / Tensorflow?

Submitted by 倾然丶 夕夏残阳落幕 on 2020-01-01 04:45:10
Question: I'm using Keras with the TensorFlow backend on a computer with an NVIDIA Tesla K20c GPU (CUDA 8). I'm training a relatively simple convolutional neural network, and during training I run the terminal program nvidia-smi to check GPU use. As you can see in the following output, GPU utilization commonly shows around 7%-13%. My question is: during CNN training, shouldn't GPU usage be higher? Is this a sign of a bad GPU configuration, or of bad usage by Keras/TensorFlow? nvidia-smi output: … Answer 1: Could …
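The answer is cut off at "Could". Whatever it goes on to say, a common first experiment is to check whether each training step is simply too small to keep the device busy: increase the batch size and watch utilization. A self-contained sketch with a current tf.keras; the tiny model and random data are placeholders for the question's own CNN:

    import numpy as np
    from tensorflow import keras

    # Placeholder data standing in for the question's dataset.
    x_train = np.random.rand(2048, 28, 28, 1).astype("float32")
    y_train = np.random.randint(0, 10, size=2048)

    model = keras.Sequential([
        keras.layers.Conv2D(32, 3, activation="relu", input_shape=(28, 28, 1)),
        keras.layers.Flatten(),
        keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

    # Larger batches amortize per-step launch overhead; watch utilization with
    # `nvidia-smi -l 1` in another terminal while each run executes.
    for batch_size in (32, 128, 512):
        model.fit(x_train, y_train, batch_size=batch_size, epochs=1)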

Cuda: library nvvm not found

Submitted by 喜夏-厌秋 on 2019-12-31 12:12:13
Question: I am trying to run the code below, but an error is reported:

    NvvmSupportError: libNVVM cannot be found. Do conda install cudatoolkit:
    library nvvm not found

My development environment is Ubuntu 17.04 with Spyder/Python 3.5, and I have installed numba and cudatoolkit via conda. NVIDIA GPUs (GTX 1070 and GTX 1060).

    import numpy as np
    from timeit import default_timer as timer
    from numba import vectorize

    @vectorize(["float32(float32, float32)"], target='cuda')
    def VecADD(a, b):
        return a + b

    n = …
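The error message itself contains the usual fix: run `conda install cudatoolkit` so that numba can locate libNVVM. For reference, a complete, hedged version of the benchmark above; the array size and the timing code after `n =` are cut off in the excerpt, so the values below are assumptions:

    import numpy as np
    from timeit import default_timer as timer
    from numba import vectorize

    @vectorize(["float32(float32, float32)"], target='cuda')
    def VecADD(a, b):
        return a + b

    # Assumed array size; the original value is truncated in the excerpt.
    n = 10_000_000
    a = np.ones(n, dtype=np.float32)
    b = np.ones(n, dtype=np.float32)

    start = timer()
    c = VecADD(a, b)  # compiled for and executed on the CUDA device
    print("VecADD took %f s" % (timer() - start))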

CPU SIMD vs GPU SIMD?

Submitted by 若如初见. on 2019-12-31 11:42:43
Question: GPUs use the SIMD paradigm, that is, the same portion of code is executed in parallel and applied to various elements of a data set. However, CPUs also use SIMD and provide data-level parallelism; for example, as far as I know, SSE-like instructions process several data elements in parallel. While the SIMD paradigm seems to be used differently in GPUs and CPUs, do GPUs have more SIMD power than CPUs? In which way are the parallel computational capabilities of a CPU 'weaker' …
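The question is cut off above. As one illustrative (not authoritative) way to see both flavors of SIMD from the same code, numba's @vectorize can compile an identical elementwise function either for the CPU's vector units or for a CUDA GPU; a sketch, assuming numba and a CUDA toolkit are installed:

    import numpy as np
    from numba import vectorize

    # Same scalar function, two SIMD-style backends:
    # 'parallel' uses the CPU's vector units across cores,
    # 'cuda' maps the elements onto thousands of GPU lanes.
    @vectorize(["float32(float32, float32)"], target='parallel')
    def add_cpu(a, b):
        return a + b

    @vectorize(["float32(float32, float32)"], target='cuda')
    def add_gpu(a, b):
        return a + b

    x = np.ones(1_000_000, dtype=np.float32)
    y = np.ones(1_000_000, dtype=np.float32)
    print(add_cpu(x, y)[:3], add_gpu(x, y)[:3])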

why do we need cudaDeviceSynchronize(); in kernels with device-printf?

Submitted by 老子叫甜甜 on 2019-12-31 10:46:29
Question:

    __global__ void helloCUDA(float f)
    {
        printf("Hello thread %d, f=%f\n", threadIdx.x, f);
    }

    int main()
    {
        helloCUDA<<<1, 5>>>(1.2345f);
        cudaDeviceSynchronize();
        return 0;
    }

Why is cudaDeviceSynchronize(); needed here, when in many places (for example, here) it is not required after a kernel call? Answer 1: A kernel launch is asynchronous. This means it returns control to the CPU thread immediately after starting up the GPU process, before the kernel has finished executing. So what is the next thing in the CPU thread here? …
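For what it's worth, the same asynchrony is easy to reproduce from Python with numba's CUDA support, which likewise buffers device-side output until a synchronization point; a hedged sketch, assuming numba with a working CUDA toolkit:

    from numba import cuda

    @cuda.jit
    def hello(f):
        # Device-side print, analogous to printf in the CUDA C kernel above.
        print("Hello thread", cuda.threadIdx.x, f)

    hello[1, 5](1.2345)  # the launch returns immediately, before the kernel runs
    cuda.synchronize()   # without this, the process may exit before output is flushed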

Difference between kernels construct and parallel construct

Submitted by 為{幸葍}努か on 2019-12-31 10:41:13
Question: I have studied many articles and the OpenACC manual, but I still don't understand the main difference between these two constructs. Answer 1: The kernels directive is the more general case, and probably the one you might think of if you've written GPU (e.g. CUDA) kernels before. kernels simply directs the compiler to work on a piece of code and produce an arbitrary number of "kernels", of arbitrary "dimensions", to be executed in sequence, to parallelize/offload a particular section of code to the …

Accelerating MATLAB code using GPUs?

Submitted by 左心房为你撑大大i on 2019-12-31 08:34:11
Question: AccelerEyes announced in December 2012 that it is working with MathWorks on GPU code and has discontinued its product Jacket for MATLAB: http://blog.accelereyes.com/blog/2012/12/12/exciting-updates-from-accelereyes/ Unfortunately, they no longer sell Jacket licences. As far as I understand, Jacket's GPU array solution, based on ArrayFire, was much faster than the gpuArray solution provided by MATLAB. I started working with gpuArray, but I see that many functions are implemented poorly. For …

How can I get my Java program running on a GPU? How do I change my program so it can be accelerated? [closed]

Submitted by 蹲街弑〆低调 on 2019-12-31 07:41:53
Question: Closed. This question needs to be more focused. It is not currently accepting answers. Closed 3 years ago. I wrote a program made up of several classes, but the calculation is too slow (program in bold). I would like to get my Java program running on a GPU to speed up the computation, or is there another way to improve the running speed? How do I change my code? Calculation of the program …