GPU

How to get a GPU's name on the Windows operating system with C++

微笑、不失礼 submitted on 2019-12-13 02:36:24
Question: I want to get a GPU's exact name, for example "ATI Radeon HD 4830". But when I read the registry I get a string like "ATI Radeon HD 4800 Series", and querying through the D3D or OpenCL interfaces also returns "ATI Radeon HD 4800 Series". How can I get the GPU's name correctly?

Answer 1: I don't remember the exact function you need to call, but you need to use the SetupDiXxx functions (a sketch follows below). Warning: it's a little painful.

Answer 2: You can try this with C++ AMP, if you …
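A minimal sketch of the SetupDi approach from Answer 1, assuming the display adapter's device description is the string you want. Note that this string comes from the driver's INF file, so it may still be a series name rather than the exact retail model.

    // Enumerate display adapters via the SetupDi API; link against setupapi.lib.
    #include <windows.h>
    #include <setupapi.h>
    #include <devguid.h>   // GUID_DEVCLASS_DISPLAY
    #include <iostream>

    int main() {
        // Handle to the set of all display-class devices currently present.
        HDEVINFO devs = SetupDiGetClassDevsA(&GUID_DEVCLASS_DISPLAY, nullptr,
                                             nullptr, DIGCF_PRESENT);
        if (devs == INVALID_HANDLE_VALUE) return 1;

        SP_DEVINFO_DATA info = {};
        info.cbSize = sizeof(info);
        for (DWORD i = 0; SetupDiEnumDeviceInfo(devs, i, &info); ++i) {
            char name[256] = {};
            // SPDRP_DEVICEDESC is the adapter's device description string.
            if (SetupDiGetDeviceRegistryPropertyA(devs, &info, SPDRP_DEVICEDESC,
                                                  nullptr,
                                                  reinterpret_cast<PBYTE>(name),
                                                  sizeof(name), nullptr)) {
                std::cout << "GPU: " << name << "\n";
            }
        }
        SetupDiDestroyDeviceInfoList(devs);
        return 0;
    }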

How to profile the number of global memory transactions for CUDA kernels?

妖精的绣舞 submitted on 2019-12-13 01:28:17
Question: How do I enable profiling of the "uncached_global_load_transaction" counter in the CUDA command-line profiler?

Answer 1: The command line profiler is controlled using the following environment variables (a concrete sketch follows the list):
- COMPUTE_PROFILE: set to 1 or 0 (or unset) to enable or disable profiling.
- COMPUTE_PROFILE_CONFIG: specifies a config file for enabling performance counters in the GPU and various other options.
- COMPUTE_PROFILE_LOG: set to the desired file path for the profiling output.
In your case you …
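A minimal sketch of that setup; the counter name is the one from the question, and the file paths are placeholders. Whether this counter exists depends on the GPU architecture.

    # Enable the command-line profiler and point it at a config file.
    export COMPUTE_PROFILE=1
    export COMPUTE_PROFILE_CONFIG=/path/to/profile.cfg
    export COMPUTE_PROFILE_LOG=/path/to/cuda_profile.log

    # Contents of profile.cfg, one counter or option per line:
    # uncached_global_load_transaction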

TensorFlow: Gradient Calculation with sparse tensors on GPU

穿精又带淫゛_ submitted on 2019-12-12 21:23:05
Question: I built a TensorFlow model similar to the multi-GPU implementation of CIFAR10. I have a basic model that is executed on every GPU, while the variables for the network live on the CPU. Everything works fine as long as I don't use sparse tensors as weight matrices in the layers. My sparse weight matrices are constructed with the function tf.sparse_to_dense() or tf.diag() . When I run it on the CPU everything works fine, but when I run it on the GPU I get the message that there is no GPU …
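The snippet is cut off here, but the message is presumably the usual "no GPU kernel" placement failure for those ops. A sketch of the common workaround, assuming TF 1.x: pin the ops that lack GPU kernels to the CPU and use the resulting dense tensor on the GPU.

    import tensorflow as tf

    with tf.device('/cpu:0'):
        # Build the dense weight on the CPU, where sparse_to_dense has a kernel.
        w = tf.sparse_to_dense(sparse_indices=[[0, 0], [1, 1]],
                               output_shape=[2, 2],
                               sparse_values=[1.0, 2.0])

    with tf.device('/gpu:0'):
        y = tf.matmul(w, tf.ones([2, 1]))

    # allow_soft_placement lets TF move any remaining CPU-only ops automatically.
    with tf.Session(config=tf.ConfigProto(allow_soft_placement=True)) as sess:
        print(sess.run(y))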

How to select a GPU with CUDA?

浪子不回头ぞ submitted on 2019-12-12 20:13:58
Question: I have a computer with 2 GPUs. I wrote a CUDA C program and I need to tell it somehow that I want to run it on just 1 of the 2 graphics cards. What is the command I need to type, and how should I use it? I believe it is somehow related to cudaSetDevice, but I can't really find out how to use it.

Answer 1: It should be pretty clear from the documentation of cudaSetDevice, but let me provide the following code snippet:

    bool IsGpuAvailable() {
        int devicesCount;
        cudaGetDeviceCount(&devicesCount);
        …
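The answer's snippet is truncated in this capture. A minimal self-contained sketch of the same idea, assuming you simply want device 0:

    // Select one of the installed GPUs for all subsequent CUDA calls
    // made on this host thread.
    #include <cstdio>
    #include <cuda_runtime.h>

    int main() {
        int deviceCount = 0;
        cudaGetDeviceCount(&deviceCount);
        printf("Found %d CUDA device(s)\n", deviceCount);

        cudaError_t err = cudaSetDevice(0);  // index of the card you want
        if (err != cudaSuccess) {
            fprintf(stderr, "cudaSetDevice failed: %s\n",
                    cudaGetErrorString(err));
            return 1;
        }

        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, 0);
        printf("Using device 0: %s\n", prop.name);
        return 0;
    }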

Why does setting an initialization value prevent placing a variable on a GPU in TensorFlow?

北城余情 submitted on 2019-12-12 19:51:06
Question: I get an exception when I try to run the following very simple TensorFlow code, although I virtually copied it from the documentation:

    import tensorflow as tf

    with tf.device("/gpu:0"):
        x = tf.Variable(0, name="x")

    sess = tf.Session()
    sess.run(x.initializer)  # Bombs!

The exception is:

    tensorflow.python.framework.errors.InvalidArgumentError: Cannot assign a device to node 'x': Could not satisfy explicit device specification '/device:GPU:0' because no supported kernel for GPU devices is available. …
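The usual explanation, as the title hints, is the initial value: tf.Variable(0) infers dtype int32, and int32 variables have no GPU kernel in this generation of TensorFlow. A sketch of the two common fixes:

    import tensorflow as tf

    with tf.device("/gpu:0"):
        x = tf.Variable(0.0, name="x")  # float32 initializer: GPU kernel exists

    # Alternatively, keep the int32 variable and let TF fall back to the CPU
    # for ops without a GPU kernel:
    config = tf.ConfigProto(allow_soft_placement=True)
    sess = tf.Session(config=config)
    sess.run(x.initializer)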

When to use volatile with register/local variables

吃可爱长大的小学妹 submitted on 2019-12-12 19:03:14
Question: What is the meaning of declaring register arrays in CUDA with the volatile qualifier? When I tried a register array with the volatile keyword, the number of registers spilled to local memory went down (i.e., it forced CUDA to keep values in registers instead of local memory). Is this the intended behavior? I did not find any information about the usage of volatile with register arrays in the CUDA documentation. Here is the ptxas -v output for both versions. With volatile qualifier: __volatile__ …
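For reference, a toy kernel sketch of the declaration under discussion; the array contents and sizes are illustrative.

    // 'volatile' tells the compiler that every access to buf must be
    // performed as written, which constrains how ptxas may reorder,
    // cache, or spill those accesses.
    __global__ void scale(float *out) {
        volatile float buf[4];
        for (int i = 0; i < 4; ++i)
            buf[i] = i * threadIdx.x;
        out[threadIdx.x] = buf[0] + buf[3];
    }

Compiling with the verbose flag (e.g. nvcc -Xptxas -v) reports the per-kernel register count and spill stores/loads, which is how the question's numbers were obtained.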

TensorFlow: GPU acceleration only happens after the first run

懵懂的女人 submitted on 2019-12-12 18:15:39
Question: I've installed CUDA and cuDNN on my machine (Ubuntu 16.04) alongside tensorflow-gpu. Versions used: CUDA 10.0, cuDNN 7.6, Python 3.6, TensorFlow 1.14. This is the output from nvidia-smi, showing the video card configuration:

    | NVIDIA-SMI 410.78       Driver Version: 410.78       CUDA Version: 10.0     |
    |-------------------------------+----------------------+----------------------+
    | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    …
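The slowdown described in the title is generally one-time initialization cost (CUDA context creation, kernel loading, cuDNN autotuning) rather than a lack of acceleration. A sketch of how to time around it, assuming TF 1.14 as above:

    import time
    import tensorflow as tf

    x = tf.random_normal([1024, 1024])
    y = tf.matmul(x, x)

    with tf.Session() as sess:
        sess.run(y)  # warm-up run: pays the one-time setup cost
        start = time.time()
        for _ in range(10):
            sess.run(y)
        print("avg per run: %.4f s" % ((time.time() - start) / 10))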

What does “RuntimeError: CUDA error: device-side assert triggered” in PyTorch mean?

给你一囗甜甜゛ submitted on 2019-12-12 16:15:18
Question: I have seen a lot of posts about particular case-specific problems, but no fundamental motivating explanation. What does this error:

    RuntimeError: CUDA error: device-side assert triggered

mean? Specifically, what is the assert that is being triggered, why is the assert there, and how do we work backwards to debug the problem? As-is, this error message is nearly useless for diagnosing any problem, because of its generality: it seems to say that "some code somewhere that touches the GPU" has a …
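A sketch of the classic trigger and the standard debugging move: an out-of-range class index fed to a loss kernel fires the device-side assert, and running the same code on the CPU (or launching with CUDA_LAUNCH_BLOCKING=1) usually surfaces the readable error.

    import torch
    import torch.nn.functional as F

    logits = torch.randn(4, 3)            # 3 classes: valid targets are 0..2
    targets = torch.tensor([0, 1, 2, 3])  # 3 is out of range

    # On the CPU this raises a clear "Target 3 is out of bounds" style error;
    # the same call on CUDA tensors raises the opaque device-side assert.
    loss = F.nll_loss(F.log_softmax(logits, dim=1), targets)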

Check failed: error == cudaSuccess (2 vs. 0) out of memory

左心房为你撑大大i submitted on 2019-12-12 16:04:30
Question: I am trying to run a neural network with pycaffe on the GPU. This works when I call the script for the first time; when I run the same script a second time, CUDA throws the error in the title. The batch size is 1, the image size at this moment is 243x322, and the GPU has 8 GB of RAM. I guess I am missing a command that resets the memory? Thank you very much! EDIT: Maybe I should clarify a few things: I am running caffe on Windows. When I call the script with python script.py, the process terminates and the …
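One thing worth trying (a sketch only; the file names are placeholders): drop the Python references to the net before the script exits, so caffe can free its GPU buffers cleanly.

    import caffe

    caffe.set_mode_gpu()
    caffe.set_device(0)

    net = caffe.Net('deploy.prototxt', 'weights.caffemodel', caffe.TEST)
    net.forward()

    del net  # release the reference so the GPU allocations can be freed

If memory still appears occupied after the process ends, checking nvidia-smi for a lingering process holding the GPU is the next step.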

Place Image on larger canvas size using GPU (possibly CIFilters) without using Image Context

瘦欲@ submitted on 2019-12-12 13:16:11
Question: Let's say I have an image that's 100x100. I want to place the image onto a larger canvas that's 500x500. My current approach is to use UIGraphics to create a context, then draw the image onto that context:

    UIGraphics.BeginImageContext(....);
    ImageView.Draw(....);

That works great, but it's not as fast as I'd like it to be for what I'm doing. I noticed that CIFilters are extremely fast. Is there a way I can place an image on a larger canvas using CIFilters, or another method that …
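The question's snippet is Xamarin C#, but the Core Image calls map across directly. A sketch in Swift of compositing the image over a larger blank canvas on the GPU, using the sizes from the question:

    import CoreImage

    func place(image: CIImage, onCanvasOf size: CGSize) -> CIImage {
        // A transparent canvas of the target size (e.g. 500x500).
        let canvas = CIImage(color: CIColor.clear)
            .cropped(to: CGRect(origin: .zero, size: size))
        // Center the (e.g. 100x100) image on the canvas.
        let dx = (size.width - image.extent.width) / 2
        let dy = (size.height - image.extent.height) / 2
        return image
            .transformed(by: CGAffineTransform(translationX: dx, y: dy))
            .composited(over: canvas)
    }

composited(over:) is backed by CISourceOverCompositing, so the work stays on the GPU until the result is finally rendered by a CIContext.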